The Wharton School 
University of Pennsylvania 


OPIM 102: Introduction to Decision Processes and Technologies 


Fall 1997 


Paul R. Kleindorfer 
Room 1328, SH-DH 
Office Hours: Mon. 3-4, Tue. 2-3 or by Appointment 


This course is an introduction to individual (Part I) and group (Part II) problem solving and 
decision-making. The course grade will be based on class participation (20%), homework 
exercises (30%), and two exams, mid-term and final (50%). 


The following syllabus lists required readings. These come either from the text for the course 
(see below) or from a (small) bulkpack available from Wharton Reprographics. 


Required Text (available at the Penn Bookstore): 


P. R. Kleindorfer, H. C. Kunreuther and P. J. H. Schoemaker, Decision Sciences: An 
Integrative Perspective, Cambridge University Press, 1993. Hereafter cited as KKS. 


Part I: Individual Decision Processes 

September 3: Introduction to Decision Sciences, KKS: Chapter 1 

September 8: Problem Finding, KKS: Chapter 2 

September 10: Problem Solving, KKS Chapter 3 

September 15: A. L. Golub, Decision Analysis: An Integrated Approach. Appendix A: A 
Guide to DPL (New York: John Wiley & Sons, 1997), 189-201. 
R. Hogarth. Judgment and Choice. Appendix A: The Rules of Probability 
(New York: Wiley, 1980), 185-192. 

September 17: DPL Case Study: Southern Electronics I & II 

September 22: Investment Decision Making 
A. K. Dixit and R. S. Pindyck, Investment Under Uncertainty 
Chapter 2: Developing the Concepts Through Simple Examples, 26-55. 
Pass Out Investment Exercise 

September 24: Expected Utility Theory I 
KKS, skim Chapter 4, 115-129 
KKS, read Chapter 4, 129-145 
Pass Out: Utility Theory Exercise 
Investment Exercise Due 
The Wharton School 
University of Pennsylvania 


OPIM 102: Introduction to Decision Processes and Technologies 
MW 1:30-3 / 1201 SH-DH / Spring 1998 
Office Hours: M 3-4, T 2-3 & by Appointment 
W 9-10:30 
Paul R. Kleindorfer 
Room 1328, SH-DH 
Tel: 215-898-5830 
Email: kleindorfer@wharton.upenn.edu 


This course is an introduction to individual (Part I) and group (Part II) problem solving 
and decision-making. The course grade will be based on class participation (20%), 
homework exercises (30%), and two exams, mid-term and final (50%). 


The following syllabus lists required readings. These come either from the text for the 
course (see below) or from a (small) bulkpack available from Wharton Reprographics. 


Required Text (available at the Penn Bookstore): 


P. R. Kleindorfer, H. C. Kunreuther and P. J. H. Schoemaker, Decision Sciences: An 
Integrative Perspective, Cambridge University Press, 1993. Hereafter cited as KKS. 


Part I: Individual Decision Processes 

January 12: Introduction to Decision Sciences, KKS: Chapter 1 

January 14: Problem Finding, KKS: Chapter 2 

January 19: Problem Solving, KKS Chapter 3 

January 21: Introduction to Decision Analysis: Skim DPL Reading in Bulkpack 
R. Hogarth. Judgment and Choice. Appendix A: The Rules of 
Probability (New York: Wiley, 1980), 185-192. 

January 26: DPL Case Study: Southern Electronics I & II 

January 28: Investment Decision Making 
A. K. Dixit and R. S. Pindyck, Investment Under Uncertainty 
Chapter 2: Developing the Concepts Through Simple Examples, 26-55. 
Pass Out Investment Exercise 


February 2: Expected Utility Theory I 

KKS, skim Chapter 4, 115-129 

KKS, read Chapter 4, 129-145 

Pass Out: Utility Theory Exercise 

Investment Exercise Due 

February 4: Expected Utility Theory II 

Utility Theory Exercise Due 


February 9: Case Study: Spreadsheet Wars 
Case Study Write-Up Due 
(Not to exceed 2 pages, double-spaced, plus exhibits) 


February 11: Alternative Models of Decision Making Under Risk I 
KKS, Chapter 4: 145-158 
Pass Out: Intertemporal Utility Experiment 


February 16: Alternative Models of Decision Making Under Risk II 
KKS, Chapter 4: 168-176 


February 18: Intertemporal Choice Models 
Loewenstein and Prelec: “Anomalies in Intertemporal Choice” (Bulkpack) 
S. Feldman, “Why Is It So Hard to Sell ‘Savings’ as a Reason for Energy 
Conservation?”, in W. Kempton and M. Neiman (eds.) Energy Efficiency: 
Perspectives on Individual Behavior (Washington D.C.: American 
Council for an Energy-Efficient Economy, 1987), 27-40. 
Intertemporal Utility Experiment Due 


February 23: Evaluating Prescriptive Approaches 
KKS Chapter 5 


February 25: Artificial Intelligence, Decision Support, and Expert Systems 
Holland, John H. "Using Classifier Systems to Study Adaptive Nonlinear 
Networks", in D. Stein (ed.) Complex Systems (Addison-Wesley, 1989), 
463-499. 


March 2: AI, GP and Neural Networks (continued) 
Pass Out GA Assignment 


March 4: Examination I: Individual Decision Processes 


Spring Break: March 6-March 15 


Part II: Noncooperative and Cooperative Decision Processes 
March 16: Group Problem Solving: KKS Chapter 6 


March 18: Introduction to Game Theory 
E. Rasmusen Games and Information: An Introduction to Game Theory 
Chapter 1: The Rules of the Game (Oxford: Cambridge University Press, 
1989), 21-41. 
skim KKS, Chapter 7, Pages 241-250 


March 23: Understanding Interdependency 
T. C. Schelling, The Strategy of Conflict Chapter 4: Toward A Theory of 
Interdependent Decision (New York: Oxford University Press, 1971), 83-118. 


March 25: Introduction to Static Games 
R. Gibbons Game Theory for Applied Economists Chapter 1: Static 
Games of Complete Information (Princeton: Princeton University Press, 
1992), 1-29. 
Turn in PD/GA Experiment Write-up 


March 30: Static Games Continued 
R. Gibbons Game Theory for Applied Economists Chapter 1: Static 
Games of Complete Information (Princeton: Princeton University Press, 
1992), 30-53. 
Answers Due to Gibbons Ch. 1: Problems 1.2, 1.4 and 1.8 (pp. 48-50) 


April 1: Auctions 
E. Rasmusen Games and Information: An Introduction to Game Theory 
Chapter 11: Auctions (Oxford: Cambridge University Press, 1989), 245-258. 
Hand out Dynamic Games Problems 


April 6: Introduction to Dynamic and Repeated Games 
E. Rasmusen Games and Information: An Introduction to Game Theory 
Chapter 4: Dynamic Games with Symmetric Information (Oxford: 
Cambridge University Press, 1989), 83-106. 


April 9: Dynamic Games Continued 
Answers to Dynamic Games Problems Due 


April 13: Case Study: “An R&D Race” 

Prepare the following questions (to be discussed in class): 

• What are the R&D intensities that the firms would choose at the 
various stages of the race and what are the corresponding expected 
values? 

• What is the likely evolution of the race? Do you expect it to be close? 


April 15: Case Study: “The Race to Develop Human Insulin” 
Answer the following question (2-page write-up due): 
• What light does the model shed on Eli Lilly's launching and 
subsequent management of the race to develop human insulin? 


April 20: Course Review 


Examination II (Final): Game Theory and Group Decision Making 


Bulkpack list: 


DPL: Decision Analysis Software for Microsoft Windows 

Hogarth, Appendix A: The Rules of Probability 

Case Study: Southern Electronics I & II 

Investment Under Uncertainty: Developing the Concepts Through Simple Examples 
Why is it so Hard to Sell ‘Savings’ as a Reason for Energy Conservation 27-40 
Loewenstein and Prelec: Anomalies in Intertemporal Choice 

Using Classifier Systems to Study Adaptive Nonlinear Networks 

T. Schelling, Strategy of Conflict Chapter 4, 83-118 

Games and Information Chapter 1: The Rules of the Game 

11. Games and Information Chapter 11: Auctions 245-258 

12. Games and Information Chapter 4: Dynamic Games, 86-103 

13. An R&D Race 

14. The Race to Develop Human Insulin 


APPENDIX A 


Rules of probability* 


The main text argued that predictive judgement is most usefully expressed in 
probabilistic form. For instance, instead of saying that there is a ‘good chance’ of, 
say, next year's sales exceeding budget, such predictions should be calibrated with 
an explicit probabilistic statement of the form ‘There is a 0.30 probability of next 
year's sales exceeding target.’ However, this type of statement immediately raises 
two problems: First, what is meant by probability (in this case, of 0.30)? Second, 
how are such probabilities assessed? A response to the first question is given 
below in the section headed ‘The "meaning" of probability’. The second question 
is the subject of Appendix B. 

A further use of probability theory referred to in the text concerned the rules 
governing the probabilities of ‘combinations’ of events. For example, how should 
a probabilistic prediction founded on a base-rate be modified by specific data? 
How does one moderate the predictive validity of a data source by considerations 
of its reliability? The rules of probability which govern these operations are the 
subject of this Appendix. Use of the word ‘rules’, however, requires some 
clarification. The principles to be enumerated below should be considered rules in 
the same manner as the rules of logic or arithmetic. You may or may not choose to 
follow them. But following them guarantees that the probabilities you estimate 
for combinations of events will be consistent with the assessment of probabilities 
of the different events of which the combinations are composed. Furthermore, 
use of the rules allows one to deduce probabilities of single events which may not 
be intuitively evident. 

However, first consider the ‘meaning’ of probability. 


THE ‘MEANING’ OF PROBABILITY 


The ‘meaning’ of probability has been the subject of long debate and for many the 
issues are still far from settled. Nonetheless, one, and only one, operational 
definition of probability is given here. 


* This appendix does not claim to be a complete statement of the intricacies of the rules of 
probability. It simply aims to provide some basic knowledge and principles to help the reader 
appreciate aspects of the main text. 


Definition: The probability a person assigns to an event represents his or her 
subjective degree of belief that the event will occur and is expressed on a 
continuous numerical scale with end-points of 0 and 1. 


In other words, a probability is a quantified opinion. The actual scale used is, of 
course, arbitrary; however, it should be noted that there is no implication that a 
person's subjective degree of belief is arbitrary. Furthermore, the fact that a 
probability is subjective does not mean that different people will necessarily differ. 

The end-points of the probability scale, 0 and 1, represent certainty. That is, in 
assigning a probability of 0 to an event a person is saying that he or she is certain 
that the event will not occur; by an assessment of 1, certainty that the event will 
occur is implied. Intermediate values represent different shades of uncertainty. 
For example, an assessment of 0.50 implies a belief that an event is as likely to 
occur as not. 

For repetitive events, most people have a good intuitive feeling for probability 
based on considering the ratio of so-called ‘favourable’ to ‘possible’ occurrences, 
i.e. the number of times an event did occur divided by the number of times it could 
have occurred. Familiar gambling devices, such as tossing a coin or die, or 
observing a roulette wheel, are cases in point. From experience one ‘knows’ that in 
tossing a fair coin the chance of observing a ‘head’ on any throw is about one-half. 
Similar statements can be made, for example, about the observation of male or 
female births. The relative frequency of past occurrences of an event is often a 
useful indicator of probability. However, it is a mistake to equate probability and 
relative frequency unless one is willing to make a subjective judgement that all 
‘possible’ cases are equally likely, i.e. probable. Hence, a subjective judgement of 
‘degree of belief’ is involved in estimating probabilities on the basis of observed 
relative frequencies. This issue is discussed in greater detail in Appendix B. 

Many important events are, of course, not repetitive. Consider, for example, the 
possibility of certain types of accident in nuclear power plants or an investment in 
a new industry. In these cases, subjective judgement—unaided by observation of 
past relative frequency—is necessarily the sole basis of probability. 

People often baulk at the above notions. However, they must be faced squarely. 
The only comfort that can be given is that the expression of opinions in the form 
of subjective probabilities that conform to the rules of probability theory is 
consistent with several intuitively appealing principles of rational behaviour. In 
other words, if you behave rationally, your subjective opinions can be considered 
probabilities that conform to the rules of probability theory. 


RULES OF PROBABILITY 


Events 


Probabilities are assigned to events, for example the observation of rain at a 
certain place tomorrow. Consequently, it follows that an event must be precisely 
defined. For example, if a bet is made conditional on the occurrence of an event it 
should not be possible for someone to avoid paying the bet on account of a loose 
definition. The event, rain tomorrow in a certain town, for instance, would have to 
be operationally defined by a given level of precipitation at a specific point where 
measurements can be taken. 

Events belong to a class of events which make up a range of possibilities (often 
technically known as a sample space). Furthermore, events might themselves be 
subdivided. As an example, consider assessing probabilities for the level of a 
company's sales next year. The range of possibilities is from a theoretical 
minimum of zero to some maximum value. Events can be divisions of that range, 
for example, all values in excess of budget. This ‘event’ could be further 
subdivided into smaller ranges, and at the limit to actual values expressed at the 
level of dollars and cents. 

The actual definition of events one works with in a particular problem must be 
made specific. 

In using probability theory, people often refer to the complementary event. This 
covers all events in the range of possibilities other than the event you are 
considering. For example, the complementary event to sales exceeding budget is 
that of sales being equal to or less than budget. 

In the sequel, events will be labelled by letters, for instance A, B, E, etc. The 
shorthand used to denote the probability of an event is given, for example, in the 
case of event A, by p(A). Events also occur in different kinds of ‘combinations’. 
The reader should therefore note the following: 


p(A or B)  : the probability that either A or B occurs. 
p(A and E) : the probability that A and E occur. 
p(B|E)     : the conditional probability of B given E (i.e. the probability 
             of B occurring given that E has occurred or could 
             be supposed to have occurred). 


Two further points need to be made: (1) events are sometimes mutually exclusive 
which means that if one occurs the other(s) cannot. Consider, for instance, a horse 
race with four horses, A, B, C, and D. The event of any one horse winning the race 
is mutually exclusive of the others: only one horse can win the race (excluding 
ties). By definition, an event and its complement are mutually exclusive; (2) events 
can be independent of each other. By this is meant that knowledge of the 
occurrence of one event does not affect the probability of the other, or vice versa. 
For example, the events of ‘rain today’ and ‘your car breaking down tomorrow’ 
could be independent if your assessment of the probability of both events were 
the same whether or not you knew the other event had occurred. That is, you 
assess the same probability of ‘your car breaking down tomorrow’ irrespective of 
whether it did or did not rain today. Note that although in many situations one 
may use data to assess whether two events are independent (for example, sun- 
spots and stock-market prices), in the final analysis independence is a subjective 
judgement. 


Properties and rules of probability theory 
There are four properties: 


(1) For any event A, 0 ≤ p(A) ≤ 1. 
(2) If the set of all the events within a range of possibilities is denoted by S, then 
    p(S) = 1. 
(3) If A and B are mutually exclusive, then 
    p(A or B) = p(A) + p(B). 
(4) p(B|C) = p(B and C) / p(C). 


From these four properties, all the rules (and ‘theorems’) of probability theory 
may be derived. Two rules, of ‘addition’ and ‘multiplication’, are particularly 
useful in calculations. 


Addition rule 
The probability of either of two events, C and D, occurring is 

p(C or D) = p(C) + p(D) - p(C and D). 

In the special case that C and D are mutually exclusive, p(C and D) = 0, and we 
have Property 3. 
The addition rule generalizes to more than two events. 


Multiplication rule 
The probability of two events, E and F, both occurring is 

p(E and F) = p(E)p(F|E) = p(F)p(E|F). 

In other words, ‘the probability of both E and F is the probability of E multiplied 
by the probability of F given E, or the probability of F multiplied by the 
probability of E given F’. The multiplication rule also generalizes to more than 
two events. A special case of the multiplication rule occurs when E and F are 
independent. In this case, 

p(E) = p(E|F) and p(F) = p(F|E). 

That is, knowledge of F does not affect the probability of E, and knowledge of E 
does not affect the probability of F. If this is the case, then 

p(E and F) = p(E)p(F). 


Examples of addition and multiplication rules 


The following example may help the reader appreciate both rules. Imagine you 
are attending a race meeting. You have the opportunity of betting on two horses, 
G1 and G2, running in the same race, and on two horses, H1 and H2, running in 
different races. 

In the first race, what is the probability that either G1 or G2 wins? Second, what 
is the probability that both H1 and H2 win their respective races? 

The first question is a simple application of the addition rule for mutually 
exclusive events: p(G1 or G2) = p(G1) + p(G2). That is, if G1 or G2 wins the race, no 
other horse can win. You clearly have a better chance of picking the winner if you 
can bet on both G1 and G2. 

The second question can be answered by the multiplication rule: 

p(H1 and H2) = p(H1)p(H2|H1) 

That is, the probability of both H1 and H2 winning is the probability that H1 wins 
multiplied by the probability that H2 wins given that H1 won. [If H1 and H2 are 
independent, note that p(H1 and H2) = p(H1)p(H2).] 

It should be noted that since the probability of an event is at most 1, joint 
probabilities, e.g. p(H1 and H2), must be equal to or less than the individual 
probabilities of which they are composed. Thus the joint probability of several 
events is frequently a very small number (which, by the way, explains why race 
tracks can pay such large sums for naming the winners of, say, three consecutive 
races; in France, the so-called ‘tiercé’). People, incidentally, have been shown to 
overestimate systematically the joint probability of several events, and to 
underestimate the probability of one of several events occurring. That is, people's 
unaided intuitions do not appreciate the properties of the multiplication and 
addition rules. 
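
As a rough numerical illustration of the two rules in the horse-race example, the sketch below applies the addition rule to two horses in the same race and the multiplication rule to three independent races. The win probabilities and the function names `p_either` and `p_all_independent` are hypothetical, chosen only for illustration; they do not come from the text.

```python
from math import prod

def p_either(p_c, p_d, p_both=0.0):
    """Addition rule: p(C or D) = p(C) + p(D) - p(C and D)."""
    return p_c + p_d - p_both

def p_all_independent(probs):
    """Multiplication rule for independent events: product of the individual probabilities."""
    return prod(probs)

# Hypothetical assessments, not taken from the text.
p_g1, p_g2 = 0.30, 0.20      # two horses in the same race (mutually exclusive winners)
p_h = [0.30, 0.30, 0.30]     # one horse in each of three different, independent races

either = p_either(p_g1, p_g2)        # mutually exclusive, so p(G1 and G2) = 0
all_three = p_all_independent(p_h)   # joint probability of naming all three winners

print(f"p(G1 or G2)      = {either:.2f}")
print(f"p(all three win) = {all_three:.3f}")
```

With these made-up figures the chance of naming all three winners is far smaller than any single assessment, which is the point the text makes about joint probabilities.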


Bayes' theorem 


Property 4 provides the formula for calculating conditional probabilities and is 
known as Bayes' theorem (or rule). Recalling that 

p(B and C) = p(B)p(C|B), 

note that Property 4 can be re-written as 

p(B|C) = p(B)p(C|B) / p(C) 

which, if both sides are multiplied by p(C), reflects the fact that 

p(B and C) = p(C)p(B|C) = p(B)p(C|B). 

Bayes’ theorem is particularly useful for updating so-called base-rate 
probabilities by specific data (cf. Chapter 3). 


X-ray problem 


The information given is that the base-rate or prior probability of being ill is 1 out 
of 200, or 0.005. Denote this by p(I) = 0.005. You are also told that the reliability 
of the X-ray machine can be described as follows: 

p(+|I) = 0.95 
p(-|Ī) = 0.95. 

That is, the probability that the X-ray says you are ill (+) when you are ill (I) is 
0.95; the probability of the X-ray saying you are not ill (-) when you are indeed 
not ill (Ī) is also 0.95. 

What you are required to estimate is p(I|+), that is the probability that you are 
sick given that the X-ray indicates you are ill. In other words, how should the 
information from the X-ray modify your prior opinion of being ill, 
p(I) = 0.005, into a so-called posterior opinion, p(I|+)? From Bayes’ theorem, 

p(I|+) = p(I)p(+|I) / p(+). 

In the right hand side, the only quantity not provided in the problem is p(+). 
However, this can be derived by use of the addition and multiplication rules. 
Specifically, 

p(+) = p(I and +) + p(Ī and +) 
     = p(I)p(+|I) + p(Ī)p(+|Ī). 

This equation is usually not intuitively evident. Consequently, Table A1 may 
help. 


Table A1 
                           Ill (I)       Not ill (Ī) 
Test indicates ill (+)     p(+ and I)    p(+ and Ī)    p(+) 
Test indicates well (-)    p(- and I)    p(- and Ī)    p(-) 
                           p(I)          p(Ī) 


The cells of the table indicate joint probabilities, the totals in the margins 
represent the so-called marginal probabilities, e.g. p(+). Note that + can occur 
with either I or Ī, thus 

p(+) = p(+ and I) + p(+ and Ī). 

The same table completed with numbers from the problem is given in Table A2. 


Table A2 
                           Ill (I)       Not ill (Ī) 
Test indicates ill (+)     0.00475       0.04975       0.0545 
Test indicates well (-)    0.00025       0.94525       0.9455 
                           0.005         0.995 


Consequently, 

p(I|+) = p(I)p(+|I) / p(+) 
       = (0.005 x 0.95) / 0.0545 
       = 0.087 





The cab problem 


The information given in the problem is as follows: 


p(G) = 0.85       There is a base-rate or prior probability that a cab 
                  is Green. (Note, p(B) = 0.15). 

p(SG|G) = 0.80    When testing witnesses, 80% of Green cabs were 
                  said to be green (SG). Consequently, 
                  p(SB|G) = 0.20. 

p(SB|B) = 0.80    When testing witnesses, 80% of Blue cabs were 
                  said to be blue (SB). Consequently, 
                  p(SG|B) = 0.20. 


The probability that the cab involved in the accident was blue is p(B|SB). This is 

p(B|SB) = p(B)p(SB|B) / p(SB). 

Following the same procedure as above, the probability to be deduced from the 
data is p(SB). Therefore, proceeding in the same manner as before, 

p(SB) = p(B and SB) + p(G and SB) 
      = p(B)p(SB|B) + p(G)p(SB|G) 
      = (0.15 x 0.80) + (0.85 x 0.20) 
      = 0.29. 

Therefore, 

p(B|SB) = p(B)p(SB|B) / p(SB) 
        = (0.15 x 0.80) / 0.29 
        = 0.41. 


It is unusual for people to be able to work out the correct answers to these 
problems without prior exposure to similar kinds of problems. The best advice 
for novices is to play with tables, as illustrated for the X-ray problem, as opposed 
to manipulating formulae. Bayes’ theorem is the normative method for modifying 
base-rate opinion by specific data. 
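
The table-based procedure recommended above can be sketched in a few lines of code. The function below builds the two joint probabilities in the "signal observed" row of the table, adds them to obtain the marginal, and divides. The function name `posterior` and its parameter names are assumptions made for this sketch; the numerical inputs are the ones given in the X-ray and cab problems.

```python
def posterior(prior, hit_rate, false_alarm_rate):
    """Return (p(H | signal), p(signal)) from one row of the joint-probability table.

    prior            -- p(H), the base rate
    hit_rate         -- p(signal | H)
    false_alarm_rate -- p(signal | not H)
    """
    joint_h = prior * hit_rate                      # p(H and signal), multiplication rule
    joint_not_h = (1 - prior) * false_alarm_rate    # p(not H and signal)
    marginal = joint_h + joint_not_h                # p(signal), addition rule
    return joint_h / marginal, marginal

# X-ray problem: p(I) = 0.005, p(+|I) = 0.95, p(+|not ill) = 1 - 0.95 = 0.05
xray_post, xray_marginal = posterior(0.005, 0.95, 0.05)

# Cab problem: p(B) = 0.15, p(SB|B) = 0.80, p(SB|G) = 0.20
cab_post, cab_marginal = posterior(0.15, 0.80, 0.20)

print(f"X-ray: p(+)  = {xray_marginal:.4f}, p(I|+)  = {xray_post:.3f}")
print(f"Cab:   p(SB) = {cab_marginal:.2f}, p(B|SB) = {cab_post:.2f}")
```

Run as written, this reproduces the figures worked out in the text: 0.0545 and 0.087 for the X-ray problem, 0.29 and 0.41 for the cab problem.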
Why Is It So Hard to Sell 
“Savings” As A Reason 
For Energy Conservation? 


Shel Feldman 
National Analysts¹ 


INTRODUCTION 


In the 1982 ACEEE Summer Study, Awad and I (Feldman and 
Awad 1982) identified four attitudinal factors that relate to intentions 
for conserving electricity. These four factors are: 

• lack of cynicism (the belief that conservation efforts make a 
difference and that individuals can control their wants, their 
energy use, and their energy costs); 
• concern for supply needs; 
• concern for what is socially acceptable regarding electricity 
conservation (fulfilling social norms); and 
• concern for monetary savings. 


In passing, I note three things: First, we did not find any general 
conservation ethic among members of our sample. Second, perceptual 
and attitudinal variables are linked to intentions to conserve 
electricity; the objective factors we studied are not sufficient to 
explain those intentions. Third, concern with monetary savings 
not only failed to provide a full explanation of the variance in 
intentions, but it was also not a significant predictor in the case of 
most specific intentions considered, and it was not the major explanatory 
variable of our analyses. 

Table 1. Weights for Predicting Intentions to Conserve 
Electricity, by End Use Considered. 

* statistically significant (p < 0.05 at the 95% confidence level) 


We and others connected with a project sponsored by Southern 
California Edison have reported additional data on attitudinal 
concomitants of conservation intentions (Awad et al. 1983) and 
expanded on the model involved (Feldman et al. 1983; Williams 
1983).² 


Expansion of the model was needed for two reasons: First, no 
psychological model I know of claims that behavior is a direct 
function of attitudes, except in the most extremely constrained 
laboratory situation. As Williams (1983) has suggested, following 
Fishbein and Ajzen (1975), voluntary actions depend on both 
behavioral intentions and opportunities to perform the behaviors 
at issue. In turn, behavioral intentions are affected both by preferences 
and by the perceived normativeness of the behaviors at issue; 
and preferences are affected both by beliefs about the attributes of 
the behavior and attitudes relating to those attributes. Hence, the 
[...]tion, housing, and associated life-style preferences. 

2 The author thanks Southern California Edison for permission to cite these 
data. The opinions expressed in this chapter are those of the author, however, and 
do not necessarily represent the opinions or policies of the Southern California 
Edison Company. 

Second, for a variety of practical reasons, it is inappropriate to 
depend heavily on a strictly attitudinal model in demand-side planning. 
Education toward energy conservation is only one of several 
demand-side strategies. Among others are hardware-based strategies, 
such as obtaining consumer agreement to the application of 
direct load control and promotion of energy-efficient housing and 
appliance stock; other behavioral strategies, such as fostering 
changes in time of use in order to modify load curves; and 
economic strategies, such as overall price increases, price increases 
for selected fuels or selected times or amounts of use, and changes 
in rate structures. 


Indeed, strategies focusing on behavioral change through persuasion 
and education suffer from particular difficulties, including the 
fact that conflicting motives constantly recur, since many changes 
are inconvenient and cause discomfort to the individual, and many 
require cooperation among various members of the family unit. 
Moreover, current energy supplies appear quite adequate, with 
price rises having moderated significantly and little public attention 
being devoted to the issue. Last, but not least, neither public 
agencies nor utility companies have a great deal of control over 
relevant sources of communication and persuasion: Compare the 
total public information budget of any utility or group of utilities 
with the amount spent on promoting a new additive to an existing 
soap in the hope of gaining two or three share points. 

One approach we and others have attempted, given the factors 
discussed above, has been to expand the range of studies of the 
determinants of energy conservation attitudes, intentions and 
relevant behaviors. In particular, we have recently attempted to 
measure the amount of investment in energy efficiency reported by 
ratepayers, both in particular technologies and overall, the amount 
of investment intended, and the willingness to purchase energy- 
efficient technologies at some premium over less efficient technolo- 
gies. 

This last variable—willingness to pay some premium for energy- 
efficient technology—has seemed rather promising, both for practi- 
cal and for theoretical reasons. Practically, if consumers can be 
induced to purchase energy-efficient housing and appliances, the 
difficulties cited earlier with regard to persuading people to engage 
repeatedly in energy-conserving behavior could be avoided. Energy 
savings could be built into our society, and they could presumably 
be incorporated without government mandates and coercion. 
Theoretically, we would be able to hook into the powerful choice 
models developed by economists such as Hausman (1979), treating 
attitudes and preferences as simply additional variables to be 
monetized and included in the analysis of the implicit discount 
rate. Moreover, by working with these variables, we are clearly 
making use of and expanding on the concern for monetary savings 
that was earlier identified as one of the psychological factors that 
motivates people to conserve. 


SAVINGS AND THE IMPLICIT DISCOUNT RATE 


The implicit discount rate approach is based on the postulate that 
the rational person offered several investment options prefers that 
option with the lowest life-cycle costs. Since life-cycle costs include 
first costs, operating and maintenance costs, and interest costs, it is 
simple to compare the costs of two appliances, say, whose performance 
characteristics are equal except for energy efficiency. 

In many cases, the first costs for the more energy-efficient unit 
are higher than those for the less efficient unit, but the operating 
and maintenance costs of the more energy-efficient unit are lower. 
Hence, a larger initial investment would result in later savings, 
which should equal, and then exceed, the initial price difference at 
some time in the future. By determining how soon the energy-efficient 
unit must “pay back” the differential initial investment, 
the analyst can determine the interest rate that the buyer implicitly 
requires on his or her initial investment. The shorter the payback 
period required, the larger the implicit discount rate, and the less 
willing the individual is to invest in energy efficiency. 
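
One way to make the payback argument concrete is to solve for the rate at which the discounted stream of energy savings just equals the extra first cost. The sketch below does this by bisection for the refrigerator item quoted in Figure 1 ($85 premium, $25 saved per year). The 10-year horizon, the end-of-year timing of savings, and the function names are assumptions made for illustration, not the author's stated procedure.

```python
def present_value_of_savings(rate, annual_saving, years):
    """Present value of a constant annual saving received at the end of each year."""
    return sum(annual_saving / (1 + rate) ** t for t in range(1, years + 1))

def implicit_discount_rate(premium, annual_saving, years, lo=1e-9, hi=10.0, tol=1e-10):
    """Rate at which discounted savings just equal the extra first cost (bisection).

    Assumes premium < annual_saving * years, so that a positive rate exists.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if present_value_of_savings(mid, annual_saving, years) > premium:
            lo = mid   # savings still worth more than the premium: the rate must be higher
        else:
            hi = mid
    return (lo + hi) / 2

# Refrigerator item: $85 premium, $25 saved per year; a 10-year life is assumed here.
rate = implicit_discount_rate(premium=85, annual_saving=25, years=10)
print(f"implicit discount rate = {rate:.1%}")

# A buyer who accepts only a smaller premium for the same savings demands a faster
# payback, which corresponds to a higher implicit discount rate.
rate_small_premium = implicit_discount_rate(premium=50, annual_saving=25, years=10)
print(f"with a $50 premium     = {rate_small_premium:.1%}")
```

Under these assumptions, accepting the $85 premium implies a required return of somewhat over a quarter per year, and refusing anything above $50 implies a considerably higher one.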

With this model, we can also ascertain the effects of other 
differences in performance characteristics on implicit discount 
rates, as well as the effects of different demographic characteristics 
of ratepayers, and the effects of different attitudes and beliefs. 
That is, we can determine whether different groups of ratepayers, 
such as younger and older people or homeowners and renters, 
differ in their implicit discount rates, and whether such motivational 
factors as cynicism or concern for monetary savings are associated 
with different discount rates. 

The results of these analyses could be used in various ways. For 
example, they could be used to target subsidy programs, 
attempting to provide incentives that are most cost-effective to 
induce purchase of energy-efficient technology, to just the right people, 
in just the right amount. 

Fig. 1. Item wording for measurement of implicit discount rate. Suppose you needed 
a new refrigerator now, and you decided to buy a 19 cubic foot model. As you may know, 
more energy-efficient models cost a bit more than less energy-efficient models, but all 
refrigerators tend to last at least 10 years without repairs. If your electric company offered 
you a model that would save $25 a year in energy costs, would you pay $85 more to 
purchase it in the first place? (* marks the end of question.) 

This model poses some difficulties, however. It assumes a
rational investor, with perfect knowledge of the marketplace, costs,
benefits, and preferences. Of course, economists and others recognize
that perfect knowledge seldom exists, and a number of proposals
have been made to deal with the problem. Among these are
rules for labeling houses and appliances with energy cost information
and enhancing the visibility and clarity of the labels.

There is at least some reason to doubt that most consumers do
in fact use the information presented in any comprehensive way.
If people act as if they are computing required payback periods or
discount rates on the basis of costs, benefits and preferences,
we might expect that observed discount rates are more or
less normally distributed across people. After all, perceptions of
costs, benefits and preferences differ more or less normally across
people, and their effects should be expected to combine in such a
way as to create a randomly distributed implicit discount rate.
Various constraints in the system could be expected to skew the
observed distribution, or to cause it to become more leptokurtic
(that is, taller and thinner) than normal. For example, if people
observed available money market interest rates for individual
investment and set those as the minimum acceptable discount rates
for energy-efficiency investments, the distribution might be
positively skewed, and rather leptokurtic just above the market rate,
but still a recognizable distortion of a normal distribution.

In two studies conducted for Southern California Edison (Feldman
et al. 1983), we asked respondents to a telephone survey to
indicate the amount of additional money they would pay for an
energy-efficient refrigerator that would last at least 10 years, and
would save $25 per year in energy costs. (See Figure 1 for the [...]

[...] our analysis, just as they affect that of Houston (1983). Something
is clearly wrong, however: The distribution is in no way smooth,
and it is not simply a matter of one study, or the numbers of
respondents included. We find our analyses improve significantly
[...] respondents into those with low, medium and high discount rates such as
those utilized by Hausman (1979). People appear to respond far
more grossly than the implicit discount rate model seems to imply.

Fig. 2. Distributions of implicit discount rates for energy-efficient refrigerators.

Table 2. Distribution of Discount Rates for an Energy-Efficient
Refrigerator

Additional     Discount    Percentage   Lower    Percentage   Lower
amount         rate (1)
(Total)                    (100)        (97)     (100)        (98)
$30 or less    83          8            97       8            98
$35            71          ?            89       1            90
$45            55          ?            88       *            89
$50            49          ?            87       ?            89
$55            44          0            86       ?            88
$60            40          0            86       ?            87
$65            37          ?            86       ?            ?
$70            34          0            84       ?            86
$75            31          ?            84       2            85
$80            29          4            83       3            83
$85            27          ?            79       ?            80
[...]
$100           21          19           ?        11           54
$105           20          4            ?        5            41
$110           19          3            ?        4            36
$115 or more   17          ?            ?        ?            32

* Less than 0.5 percent
(1) Assuming $25 return per year with no salvage value and immediate [...]

Thus, while we have some reservations about the methods used 
by Houston (1983), we concur with one form of his basic conclu- 
sion: A substantial number of people do not appear to behave in 
accordance with the model underlying the implicit discount rate 
approach. In his analysis, Houston found those with lower 
incomes and less experience with energy-conserving activities less 
likely to respond to his discount rate question. Hausman found
that those with lower incomes acted as if they had extremely large
discount rates, and Gately (1980) reported truly enormous discount
rates for many in his sample, again calling into question whether
the model is appropriate as a description of the behavior of the
individuals studied. In our study, we found that implicit discount
rates varied with the future orientation of the respondents, as well
as with their reported experience with energy-efficiency investments.
Furthermore, 28 percent to 40 percent of the responses
were off the ends of the scale.

Why don't people behave in accordance with the model? Hous- 
ton suggests that many do not have the appropriate conceptual 
tools, but he does not describe what those tools are, and why some 
may lack them. We suggest that there is a basic lack of ability to 
impute value to an object and to conserve that value in the face of 
apparent changes over time. In the remainder of this paper, we 
shall endeavor to explain the psychological notion of conservation, 
indicate its relevance to economic concepts such as value, and sug- 
gest some implications of this analysis for problems in energy con- 
servation. 


CONSERVATION IN PSYCHOLOGICAL 
THEORY 


Jean Piaget, a well-known Swiss psychologist, first described an 
experiment with young children (Piaget & Inhelder 1941) that has 
since been replicated thousands of times. The adult takes two
lumps of clay and rolls each into a ball, adding to or removing clay
from one, until the child agrees that each contains the same
amount of clay. The adult then rolls one into a sausage form, as
the child watches, and asks whether the two pieces now contain the
same amount of clay, or one has more, or one has less than the
other. Typically, the child of five or six will say the two pieces are
no longer equal. (Most often, the child seems to focus on the
length of the "sausage" and says it contains more clay than the 
ball.) Piaget explains the phenomenon as indicating that the child 
focuses on one salient dimension in the transformation from ball to 
sausage and, seduced or overwhelmed by the difference between 
the two pieces of clay on that one dimension, fails to recognize the 
concomitant shift in other dimensions such that the amount of clay 
has been conserved during the transformation. 


Of particular interest, the child is usually perfectly capable of 
recognizing that no clay was added or taken away during the 
transformation, perfectly capable of noting that the two pieces of 
clay are equivalent when the sausage is returned to the form of a 
ball. Indeed, it is only when the adult asks for a logical explanation
that the child engages in any distortion of reality, asserting
that the adult must have secretly added clay to the "sausage," for 
example. In other words, the child is not necessarily lacking in the 
logical knowledge that the clay cannot have become more or less in 
being reshaped; the child simply cannot assimilate that knowledge 
to the perceptual situation with which he or she is confronted. 
However, the child will continue to insist that there is more clay in 
one piece than another when the shapes differ; and no amount of 
logical argument or adult remonstrance, and no direct tuition, 
changes the child's mind. 

Only with additional maturity and with what Piaget calls experience
with accommodation and assimilation of various schemata does
the child come to "conserve" changes in mass while objects
undergo shape transformations.

This demonstration is but one of a class of similar ones.
Piaget and his followers have shown that children have difficulty 
conserving various relationships in the face of different transforma- 
tions. Among these are the physical relationships including mass, 
number, and volume, but also such non-physical relationships as 
family membership. 

Moreover, while the preschooler may have trouble with conser- 
vation in the face of direct physical transformations, older children 
encounter similar difficulties in dealing with symbolic transforma- 
tions of greater complexity. A classic example of a problem posed 
to adolescents is the following: "Edith is fairer than Suzanne.
Edith is darker than Lilly. Who is darkest of the three: Suzanne,
Edith, or Lilly?" Solution of this problem requires constructing the
inverse of one of the statements given, while conserving the other
existing relationships in the face of the symbolic transformations.*

According to Piaget, the structure of the problem facing the subject
of the symbolic transformation experiment is the same as that
facing the child watching the manipulation of the balls of clay;
where perceiving the length of the "sausage" may dominate the
ability to appreciate the reciprocity of the change in diameter to
the change in length, the explicit verbalizations of "darker" and
"lighter" now dominate the reciprocity of the logical relationships.
Only the specific relationship and the level of abstraction
have changed.


What has all this to do with value, investments, and implicit
discount rates? Simply this: The model underlying the computation
and study of implicit discount rates assumes that people
readily compute the present value of an investment and readily 
project that value into the future. We suggest that such computa- 
tions and projections are extraordinarily complex, and require a 
conservation of value that is unlikely to be in ordinary use by 
many people. 

In developing investment strategies keyed to internal rates of 
return, we assume that an individual utilizes two types of 
knowledge. The first is straightforward: the current market rate of 
interest, modified by expectations about changes in that rate over 
the effective life of the investment contemplated. But the second 
may not be as readily applied as we are prone to assume. 


Specifically, the individual must first recognize the need to pro- 
ject into the future the relationship between the cost of the goods 
or services in which investment is contemplated and his or her 
current preferences. In doing so, the individual must also recog- 
nize the dependence of that relationship on the value of money and 
on his or her preferences. But these are complex relationships, and 
not everyone appreciates them, or even the need to understand
them. 


MONEY AND ITS FUNCTIONS 


Economists traditionally treat money as serving two major functions.
First, it serves as a medium of exchange that facilitates the
transfer of goods and services among producers and consumers.
To the extent that money is held over time and exchange is
deferred, it also serves to store value. But a full understanding of
the use of money in this function requires appreciation of the
opportunity costs of saving, the origin of interest payments, and
the relationships between liquidity preference and investment
behavior.

* Thus, the child might construct the inverse of the first statement, obtaining
"Suzanne is darker than Edith." Only now can she solve the problem using
transitivity ("Suzanne is darker than Lilly").

In its second major function, money serves as a standard of
value, without which a complex society cannot function. It is only 
as price ratios are translated into common units that the value of 
the diverse products of an industrial society can be measured 
against one another, and only as money is integrated with labor, 
time, effort and skill, in the wage-unit, that various forms of work 
can be compared with one another. Moreover—and this issue is 
the critical one for our present purposes—it is only to the extent 
that the delivery of future products or future labor can be valued, 
with money serving as a standard of deferred payment, that con- 
tracts can be written and enforced, and that future value can be
compared with present value in terms of payback periods, implicit 
discount rates, or avoided costs. 

But there is ample evidence to suggest that many people do not 
understand how money functions as a standard of deferred pay- 
ment. In a recent survey for the Public Opinion Index, for exam- 
ple, we sought to study public perceptions of inflation. In tele- 
phone interviews, we asked 1004 persons in a national probability 
sample the rate of inflation in 1983. The median estimate among 
the 44 percent of our sample that answered the question was that 
inflation ran at 7.75 percent. The median estimate varied among 
subgroups: for example, from 7.30 percent among professionals, 
managers, and owners, through 8.24 percent among white-collar 
and sales-clerical workers, to 8.56 percent among blue-collar work- 
ers. But in none of the demographic analyses was the median 
below six percent. The actual inflation rate in 1983 was slightly 
over three percent, as measured by the consumer price index. 

We also asked members of our sample how much they would 
have to spend today to get what one dollar bought a year ago, in 
the belief that they would be more accurate on this more concrete 
sort of question. Table 3 shows that the exact opposite occurred. 
While respondents felt capable of answering the question (only 19 
percent failed to try to do so), the answers were much further from 
the true rate of inflation in 1983 than were the direct estimates of 
inflation rates. The overall median estimate was an astounding 
$1.41. Differences among subgroups were often quite striking, as
well. The medians vary by $0.33 with level of educational attainment,
for example. Even among college graduates, the median was
$1.20.

Table 3. Median Estimates of Money Needed Today to Buy What
One Dollar Bought a Year Ago

Selected Segment                        Median
Total Public                            $1.41
Sex
  Male                                  $1.27
  Female                                $1.60
Age
  19-24 years of age                    $1.41
  25-34 years of age                    $1.36
  35-44 years of age                    $1.39
  45-64 years of age                    $1.45
  65 years of age or older              $1.42
Educational Attainment
  High school incomplete or less        $1.53
  High school graduate                  $1.47
  College incomplete                    $1.29
  College graduate or more              $1.20
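The gap between these answers and measured inflation is easiest to see by converting each median into the one-year inflation rate it implies. The sketch below does only that arithmetic on the Table 3 medians; the 3 percent benchmark is an approximation of the "slightly over three percent" figure quoted earlier, and the variable names are illustrative.

```python
# Approximate benchmark: the text reports 1983 inflation as "slightly over three percent".
ACTUAL_INFLATION = 0.03

# Median dollars said to be needed today to buy what $1.00 bought a year ago (Table 3).
median_dollars_needed = {
    "Total public": 1.41,
    "Male": 1.27,
    "Female": 1.60,
    "High school incomplete or less": 1.53,
    "High school graduate": 1.47,
    "College incomplete": 1.29,
    "College graduate or more": 1.20,
}

# Needing $1.41 to replace $1.00 is the same as asserting 41 percent inflation.
implied_inflation = {seg: amount - 1.0 for seg, amount in median_dollars_needed.items()}

education_spread = round(
    median_dollars_needed["High school incomplete or less"]
    - median_dollars_needed["College graduate or more"],
    2,
)

for segment, rate in implied_inflation.items():
    print(f"{segment:32s} implied {rate:5.0%}  ({rate / ACTUAL_INFLATION:4.1f}x the benchmark)")
```

Even the lowest median in the table implies an inflation rate several times the benchmark, which is the sense in which the concrete question produced worse answers than the direct one.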

We suggest that people are somewhat closer to reality when
asked the inflation rate directly because they hear or read news
reports of official figures relatively often. Clearly they do not recall
these figures with great accuracy, but at least their guesses are
within an order of magnitude. However, people clearly do not
integrate the figures they hear or those they recall with their
perceptions of how inflation affects them directly, that is, the cost
of goods today relative to a year ago. And people clearly do not hold
constant in their minds those relative costs. Most people are not using
money as a standard of deferred payment; they are not conserving
the exchange value of goods they are purchasing in the face of
changes in the nominal value of money. Rather, like the child with
the "sausage," they are focusing on the perceptually arresting
transformations that are occurring.




CONCLUSION 


If most people do so poorly in conserving value over time, it is
unlikely that they will do well in projecting value into the future.
Implicit discount rates appear to be distributed categorically in the
general public because they reflect the different attitudes of
different population segments toward investing in energy efficiency.
Avoided costs and implicit discount rates are probably not useful 
concepts for describing the behavior of the general public, however 
useful they may be for analyzing the behavior of commercial and 
industrial decision makers. We would do well to avoid building 
our models and preparing our advertising campaigns based upon 
the assumption that the energy consumer operates—or can 
operate—as a rational investor. 
Anomalies in Intertemporal Choice: Evidence and an Interpretation


GEORGE LOEWENSTEIN AND DRAZEN PRELEC 


Research on decision making under uncertainty has been strongly in- 
fluenced by the documentation of numerous expected utility (EU) 
anomalies—behaviors that violate the expected utility axioms. The rela- 
tive lack of progress on the closely related topic of intertemporal choice 
is partly due to the absence of an analogous set of discounted utility 
(DU) anomalies. We enumerate a set of DU anomalies analogous to 
the EU anomalies and propose a model that accounts for the anomalies, 
as well as other intertemporal choice phenomena incompatible with 
DU. We discuss implications for savings behavior, estimation of
discount rates, and choice framing effects.


Since its introduction by Samuelson in 1937, the discounted utility
model (DU) has dominated economic analyses of intertemporal
choice. In its most restrictive form, the model states that a sequence
of consumption levels, \( (c_0, \ldots, c_T) \), will be preferred to sequence
\( (c'_0, \ldots, c'_T) \), if and only if,

\[ \sum_{t=0}^{T} \delta^{t} u(c_t) > \sum_{t=0}^{T} \delta^{t} u(c'_t), \tag{1} \]

where u(c) is a concave ratio scale utility function, and \( \delta \) is the
discount factor for one period. DU has been applied to such diverse
topics as savings behavior, labor supply, security valuation, education
decisions, and crime. It has provided a simple, powerful framework
for analyzing a broad range of economic decisions with delayed
consequences.
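As a concrete reading of Equation (1), the sketch below ranks two short consumption sequences by their discounted utility. It is an illustration only: the logarithmic utility, the two sequences, and the function names `discounted_utility` and `du_prefers` are assumptions chosen for the example, not part of the chapter.

```python
import math


def discounted_utility(consumption, delta, utility=math.log):
    """Sum of delta**t * u(c_t) over periods t = 0, 1, ..., T."""
    return sum(delta ** t * utility(c) for t, c in enumerate(consumption))


def du_prefers(seq_a, seq_b, delta, utility=math.log):
    """True if the DU model ranks seq_a strictly above seq_b."""
    return discounted_utility(seq_a, delta, utility) > discounted_utility(seq_b, delta, utility)


# Hypothetical choice: a smaller boost now versus a larger boost two periods later.
early_boost = [120, 100, 100]
late_boost = [100, 100, 125]

print(du_prefers(late_boost, early_boost, delta=0.95))   # patient decision maker
print(du_prefers(early_boost, late_boost, delta=0.80))   # impatient decision maker
```

Under these assumptions the single parameter delta decides the ranking, which is what makes the model both tractable and, as the chapter goes on to argue, restrictive.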

Yet, in spite of its widespread use, the DU model has not received
substantial scrutiny—in marked contrast to the expected utility 
model for choice under uncertainty, which has been extensively criti- 
cized on empirical grounds, and which has subsequently spawned a 
great number of variant models (reviewed, e.g., by Weber and Cam- 
erer, 1988). 

Our first aim in this chapter is to remedy this imbalance by enu- 
merating the anomalous empirical findings on time preference that 
have been reported so far. Taken together, they present a challenge 
to normative theory that is at least as serious as that posed by the 
much more familiar EU anomalies. Unlike the EU violations, which 
in many cases can only be demonstrated with a clever arrangement 
of multiple choice problems (e.g., the Allais paradox), the counterex- 
amples to DU are simple, robust, and bear directly on central aspects 
of economic behavior. Our second aim is to construct (in the third 
section) a descriptive model of intertemporal choice that predicts the 
anomalous preference patterns. In formal structure, the model is 
closely related to Kahneman and Tversky's "prospect theory” (1979), 
but the interpretation and shape of the component functions are dif- 
ferent. The chapter concludes with a discussion of some additional 
implications of the model for individual behavior and market out- 
comes. 


Four Anomalies 


In this section, we present four common preference patterns that 
create difficulty for the discounted utility model. 


The Common Difference Effect 


Consider an individual who is indifferent between adding x units to
consumption at time t and y > x units at a later time t', given a
constant baseline consumption level (c) in all time periods:

\[ u(c + x)\delta^{t} + u(c)\delta^{t'} = u(c)\delta^{t} + u(c + y)\delta^{t'}. \tag{2} \]

Dividing through by \( \delta^{t} \),

\[ u(c + x) - u(c) = \bigl(u(c + y) - u(c)\bigr)\delta^{t' - t}, \tag{3} \]

shows that preference between the two consumption adjustments
depends only on the absolute time interval separating them, or
(t' - t) in the example above. This is the stationarity property, which
plays a critical role in axiomatic derivations of the DU model
(Fishburn and Rubinstein, 1982; Koopmans, 1960).

In practice, preferences between two delayed outcomes often 
switch when both delays are incremented by a given constant 
amount. An example of Thaler (1981) makes the point crisply: A 
person might prefer one apple today to two apples tomorrow, but at 
the same time prefer two apples in 51 days to one apple in 50 days. 
We will refer to this pattern as the common difference effect.

The common difference effect gives rise to dynamically inconsis- 
tent behavior, as noted first by Strotz (1955), and richly elaborated in 
the articles of the psychologist Ainslie (1975, 1985). It also implies 
that discount rates should decrease as a function of the time delay 
over which they are estimated, which has been observed in a number 
of studies, including one with real money outcomes (Horowitz, 
1988).² See Figure 5.6 for the results of Benzion, Rapoport, and Yagil
(1989), which are representative. 
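Thaler's apple example can be checked numerically. The sketch below is a minimal illustration under stated assumptions: value is taken to be linear in the number of apples, the exponential factor (0.4 per day) and the hyperbolic parameter (alpha = 2 per day) are arbitrary choices, and the one-parameter hyperbola 1/(1 + alpha t) is the special case discussed later in the chapter. The function names are hypothetical.

```python
def exponential(t, delta=0.4):
    """Constant-rate discounting: weight delta**t on an outcome t days away."""
    return delta ** t


def hyperbolic(t, alpha=2.0):
    """One-parameter hyperbola: weight 1 / (1 + alpha * t)."""
    return 1.0 / (1.0 + alpha * t)


def prefers_smaller_sooner(discount, delay, small=1, large=2, gap=1):
    """True if `small` apples at `delay` beat `large` apples at `delay + gap`."""
    return small * discount(delay) > large * discount(delay + gap)


for name, discount in (("exponential", exponential), ("hyperbolic", hyperbolic)):
    today = prefers_smaller_sooner(discount, delay=0)
    in_50_days = prefers_smaller_sooner(discount, delay=50)
    print(f"{name:12s} one apple now: {today}   one apple in 50 days: {in_50_days}")
```

With the exponential weights the ranking is the same at both horizons, as stationarity requires. With the hyperbolic weights the impatient choice appears only when the earlier apple is immediate, which is the common difference effect.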


The Absolute Magnitude Effect 


Empirical studies of time preference have also found that large dollar
amounts suffer less proportional discounting than do small ones.
Thaler (1980), for example, reported that subjects who were on
average indifferent between receiving $15 immediately and $60 in a year,
were also indifferent between an immediate $250 and $350 in a year,
as well as between $3,000 now and $4,000 in a year. Similar results
were obtained by Holcomb and Nelson (1989) with real money
outcomes.
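The indifference pairs just quoted can be turned into the one-year discount rates they imply. The sketch below does only that arithmetic; it treats each pair as an exact indifference and ignores any curvature of utility, which is a simplifying assumption, and the helper name `implied_annual_rate` is hypothetical.

```python
def implied_annual_rate(amount_now, amount_in_one_year):
    """Simple one-year rate r solving amount_now * (1 + r) = amount_in_one_year."""
    return amount_in_one_year / amount_now - 1.0


# (immediate amount, amount judged equivalent one year later)
indifference_pairs = [(15, 60), (250, 350), (3000, 4000)]

implied_rates = [implied_annual_rate(now, later) for now, later in indifference_pairs]

for (now, later), rate in zip(indifference_pairs, implied_rates):
    print(f"${now:>5,} now ~ ${later:>5,} in a year  ->  implied rate {rate:.0%}")
```

The implied rate falls steadily as the stakes rise, which is the absolute magnitude effect in its simplest form.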


¹ The common difference effect is analogous to the common ratio effect in decision
making under uncertainty (Kahneman and Tversky, 1979). For a discussion of
similarities and differences between the EU and DU axioms, see Prelec and Loewenstein
(1991).

² Horowitz (1988) used a second price sealed bid auction to estimate discount rates
for $50 "bonds" of varying maturity. Implicit discount rates were a declining function
of time to bond maturity.
The Gain-Loss Asymmetry


A closely related finding is that losses are discounted at a lower rate 
than gains. For example, subjects in a study by Loewenstein (1988c) 
were, on average, indifferent between receiving $10 immediately and 
receiving $21 in one year, and indifferent between losing $10 immedi- 
ately and losing $15 in one year. The corresponding figures for $100 
were $157 for gains and $133 for losses. Even more dramatic loss/ 
gain asymmetries were obtained by Thaler (1980), who estimated 
discount rates for gains that were three to ten times greater than 
those for losses. Several of his subjects actually exhibited negative 
discounting, in that they preferred an immediate loss over a delayed 
loss of equal value (see also, Loewenstein, 1987). 

The magnitude and gain-loss effects are problematic for DU in 
two senses. First, the predictions that DU makes are sensitive to the 
baseline consumption profile, because the baseline level at a given 
time period directly controls the marginal utility of an extra unit of 
consumption. Experimental subjects represent a diversity of baseline 
levels of consumption, yet these choice patterns are consistent over 
a wide range of income (and hence consumption) levels. This pattern 
evokes the comments of Markowitz (1952) on the Friedman-Savage 
explanation for simultaneous gambling and insurance purchases. 
Friedman and Savage argued that simultaneous gambling and insur- 
ance could be explained by a doubly inflected utility function defined 
over levels of wealth. Markowitz pointed out that no single utility 
function defined over levels of wealth could explain why people at 
vastly different levels of wealth engage in both activities; a function 
that predicted simultaneous gambling and insuring for people at one 
wealth level would make counterintuitive predictions for people at 
other wealth levels. 

Second, even the determinate predictions that DU yields, on the
assumption that the baseline consumption level is constant across
time periods, are not entirely consistent with the effects just
described. Note first that the present value of a consumption change at
time t, from c to c + x, can be measured in two ways, either by
assessing the equivalent present value q(x,t) defined implicitly by:

\[ u(c + q) + \delta^{t} u(c) = u(c) + \delta^{t} u(c + x), \tag{4} \]

or by assessing the compensating present value p(x,t) that would
exactly balance the change at time t:

\[ u(c - p) + \delta^{t} u(c + x) = u(c) + \delta^{t} u(c). \tag{5} \]

(These are also referred to as the methods of equivalent and
compensating variation.)

The gain-loss asymmetry is obtained by comparing the equivalent
variation ratios (q/x) for positive and negative x; here, the DU model
makes the correct qualitative prediction, as the following simple
calculation shows:

\[
\begin{aligned}
q(x,t) &= u^{-1}\bigl((1 - \delta^{t})u(c) + \delta^{t}u(c + x)\bigr) - c && \text{[solving from Equation (4)]} \\
       &< (1 - \delta^{t})c + \delta^{t}(c + x) - c && \text{(by concavity of } u(x)\text{)} \\
       &= \delta^{t}x.
\end{aligned}
\tag{6}
\]

Consequently the ratio, q(x,t)/x, is smaller than \( \delta^{t} \) for positive x, and
greater than \( \delta^{t} \) for negative x, which is consistent with the observed
greater relative discounting of gains.

The critical weakness of this explanation lies in the prediction it
makes about the size of the gain-loss asymmetry at different absolute
magnitudes. The normative explanation is driven by the global
concavity of the utility function, which creates a gap (analogous to a
risk premium) between time discounting and the pure rate of time
preference. Since the utility function is approximately linear for small
intervals (c - x, c + x), the gain-loss asymmetry should disappear
for small x. Indeed, in the limit as x goes to zero (from either side)
the predicted devaluation ratio, q/x, will approach the discount factor
\( \delta^{t} \), for both gains and losses. In practice, however, we observe the
exact opposite, with the gain-loss asymmetry being most pronounced
for small outcomes (Thaler, 1980; Benzion, Rapoport, and Yagil,
1989).
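The DU prediction described here can be reproduced with a small calculation based on Equation (4). The sketch below assumes a logarithmic utility function, a baseline consumption of 100, and a discount factor of 0.8 for the delay in question; these values and the function names are illustrative assumptions, not taken from the chapter.

```python
import math


def equivalent_present_value(x, c=100.0, discount=0.8, u=math.log, u_inv=math.exp):
    """Solve u(c + q) = (1 - d) * u(c) + d * u(c + x) for q, following Equation (4).

    `discount` plays the role of delta**t; requires c + x > 0 for log utility.
    """
    return u_inv((1.0 - discount) * u(c) + discount * u(c + x)) - c


def devaluation_ratio(x, **kwargs):
    """Ratio q / x, to be compared with the discount factor delta**t."""
    return equivalent_present_value(x, **kwargs) / x


def gain_loss_gap(size, **kwargs):
    """How much more heavily a gain of `size` is discounted than a loss of `size`."""
    return devaluation_ratio(-size, **kwargs) - devaluation_ratio(size, **kwargs)


for size in (50.0, 1.0):
    print(
        f"|x| = {size:4.0f}: gain ratio {devaluation_ratio(size):.4f}, "
        f"loss ratio {devaluation_ratio(-size):.4f}, gap {gain_loss_gap(size):.4f}"
    )
```

Under these assumptions the gain ratio sits below 0.8 and the loss ratio above it, but the gap shrinks toward zero as the amounts get small, which is the opposite of the empirical pattern the chapter reports.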

With regard to the magnitude effect, the DU predictions hinge 
partly on the method of elicitation. When present values are assessed 
by the equivalent variation method, DU contradicts the magnitude 
effect. For compensating variation, DU predicts the effect when x is 
negative, but predicts the exact opposite (i.e., smaller discounting of 
small amounts) for positive x. We now derive this last result as an 
illustration; the argument in the other cases is similar. 

Suppose that p is the most one would be willing to pay now in
order to receive x > 0 at time t, as in Equation (5), and consider what
happens as both p and x are increased by a common factor, \( \alpha > 1 \):

\[
\begin{aligned}
\frac{d}{d\alpha}\Bigl\{u(c - \alpha p) + \delta^{t}u(c + \alpha x) - \bigl(u(c) + \delta^{t}u(c)\bigr)\Bigr\}\Big|_{\alpha = 1} & && \text{[from Equation (5)]} \\
= -p\,u'(c - p) + \delta^{t}x\,u'(c + x) & \\
> 0 & && \text{(if the magnitude effect holds).}
\end{aligned}
\tag{7}
\]

After we substitute for \( \delta^{t} \) from Equation (5), this inequality reduces
to:

\[ p\,u'(c - p)\bigl(u(c + x) - u(c)\bigr) < x\,u'(c + x)\bigl(u(c) - u(c - p)\bigr). \tag{8} \]

But, because u(c) is concave, we have \( u(c + x) - u(c) > x\,u'(c + x) \)
and \( u(c) - u(c - p) < p\,u'(c - p) \), which are jointly incompatible with
the stated inequality in Equation (8).


The Delay-Speedup Asymmetry


A recent study by Loewenstein (1988a) has documented a fourth 
anomaly, consisting of an asymmetric preference between speeding 
up and delaying consumption. In general, the amount required to 
compensate for delaying a (real) reward by a given interval, from t
to t + s, was from two to four times greater than the amount subjects
were willing to sacrifice to speed consumption up by the same
interval, that is, from t + s to t. Because the two pairs of choices are
actually different representations of the same underlying pair of op- 
tions, the results constitute a classic framing effect, which is inconsis- 
tent with any normative theory, including DU. 


A Behavioral Model of Intertemporal Choice 


This section presents a model of intertemporal choice that accounts 
for the anomalies just enumerated. Our model assumes that intertem- 
poral choice is defined with respect to deviations from an anticipated 
status quo (or "reference") consumption plan; this is in explicit
contrast to the DU assumption that people integrate new consumption
alternatives with existing plans before making a choice. The objects
of choice, then, are sequences of dated adjustments to consumption
\( \{(x_i, t_i);\ i = 1, \ldots, n\} \), which we will refer to as temporal prospects.

As in the prospect theory for risky choice, we will represent
preference by a doubly separable formula [Equation (9) later], which rests
on three qualitative properties (see appendix in Kahneman and
Tversky, [1979], for details). The first property, also invoked by DU, is
that preferences over prospects are intertemporally separable
(Debreu, 1959), and can, therefore, be represented by an additive utility
function, \( \sum_i u(x_i, t_i) \). This important assumption is psychologically
most questionable when the choice is perceived to be between
complete alternative sequences of outcomes, for example, savings plans,
or multiyear salary contracts. In these cases, it appears that people
care about global sequence properties, most notably whether the
sequence improves over time (Loewenstein and Prelec, 1991;
Loewenstein and Sicherman, 1991). The present model is primarily
concerned with explaining elementary types of intertemporal choices,
involving no more than two or three distinct dated outcomes.

In the absence of any strong contrary evidence, we assume that x
and t are separable within a single outcome, so that u(x,t) equals
\( F(v(x)\phi(t)) \), where v(x) is a value function, \( \phi(t) \) a discount function, and
F an arbitrary monotonically increasing transformation. To eliminate
F, one imposes a distributivity condition: (x,t) is indifferent to
(x,t'; x,t''), implies (y,t) is indifferent to (y,t'; y,t''), for any outcome y,
which essentially states that the equality: \( \phi(t) = \phi(t') + \phi(t'') \), can
be established with any one outcome (Kahneman and Tversky, 1979,
p. 290). The discount function is then uniquely specified, given the
standard normalization \( \phi(0) = 1 \). The final model represents
preference by the formula:

\[ U(x_1, t_1; \ldots; x_n, t_n) = \sum_i v(x_i)\phi(t_i). \tag{9} \]

The remainder of this section specifies the properties of the two
component functions and shows how the model accounts for the
anomalies presented in the second section.


Discount Function 


The common difference effect reveals that people are more sensitive
to a given time delay if it occurs earlier rather than later. Specifically,
if a person is indifferent between receiving x > 0 immediately, and
y > x at some later time s, then he or she will strictly prefer the better
outcome if both outcomes are postponed by a common amount, t:

\[ v(x) = v(y)\phi(s) \quad \text{implies} \quad v(x)\phi(t) < v(y)\phi(t + s). \tag{10} \]


In order to maintain indifference, the later larger outcome would
have to be delayed by some interval s' greater than s. To account for
this phenomenon, Ainslie (1975) proposed the discount function,
\( \phi(t) = 1/t \), which had been found to explain a large body of data on
animal time discounting. We now derive a more general functional
form, by postulating that the delay that compensates for the larger
outcome is a linear function of the time to the smaller, earlier outcome
(holding fixed the two outcomes x and y):

\[ v(x) = v(y)\phi(s) \quad \text{implies} \quad v(x)\phi(t) = v(y)\phi(kt + s), \tag{11} \]
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for some constant k, which, of course depends on x and y. One can 
think of this as a more general form of stationarity, in which the 
“clocks” for the two outcomes being compared run at different 
speeds. In the normative case, the clocks are identical and k = 1, 
which yields the exponential discount function (Fishburn and Rubin- 
stein, 1982). From Equation (11), it follows that: 


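$$v(x)\,\phi(t') = v(y)\,\phi(kt' + s), \tag{12}$$

$$\begin{aligned}
v(x)\,\phi(\lambda t + (1-\lambda)t') &= v(y)\,\phi\big(k(\lambda t + (1-\lambda)t') + s\big) \\
&= v(y)\,\phi\big(\lambda(kt + s) + (1-\lambda)(kt' + s)\big) \\
&= v(y)\,\phi\big(\lambda\phi^{-1}(v(x)\phi(t)/v(y)) + (1-\lambda)\phi^{-1}(v(x)\phi(t')/v(y))\big),
\end{aligned} \tag{13}$$


after substituting for (kt + s) and (kt' + s) from Equations (11) and
(12). Letting $r = v(x)/v(y)$, $w = \phi(t)$, $z = \phi(t')$, and
$u = \phi^{-1}$ produces a functional equation,


$$r\,u^{-1}\big(\lambda u(w) + (1-\lambda)u(z)\big) = u^{-1}\big(\lambda u(rw) + (1-\lambda)u(rz)\big), \tag{14}$$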


whose only solutions are the logarithmic and power functions:
$u(t) = c\ln(t) + d$, $u(t) = ct^{\tau} + d$ [Aczel, 1966; p. 152,
Equation (18)]. As $\phi(t) = u^{-1}(t)$, the discount function must be
either exponential or hyperbolic:


(D1) The discount function is a generalized hyperbola:
$$\phi(t) = (1 + \alpha t)^{-\beta/\alpha}, \quad \alpha, \beta > 0. \tag{15}$$


The $\alpha$-coefficient determines how much the function departs from
constant discounting; the limiting case, as $\alpha$ goes to zero, is the
exponential discount function, $\phi(t) = e^{-\beta t}$. Figure 5.1
displays the hyperbolic function for three different values of $\alpha$,
along with the pure exponential that is the least convex of the four
lines. For each level of $\alpha$, a corresponding $\beta$ is selected so
that the discount function has value .3 at t = 1. When $\alpha$ is very
large, the hyperbola approximates a step function, with value one at
t = 0, and value .3 (in this case) at all other times. This would
produce dichotomous time preferences, in which the present outcome has
unit weight, and all future events are discounted by a common constant.
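The calibration described above can be sketched numerically. The following is a minimal illustration, assuming the generalized hyperbola of Equation (15); the particular values of alpha, the helper name `beta_for`, and the sample times are choices made here for illustration and are not taken from the text or from Figure 5.1.

```python
import math

def discount(t, alpha, beta):
    """Generalized hyperbola of Equation (15): (1 + alpha*t) ** (-beta/alpha)."""
    return (1.0 + alpha * t) ** (-beta / alpha)

def beta_for(alpha, target=0.3, at=1.0):
    """Pick beta so that discount(at, alpha, beta) equals target."""
    return -alpha * math.log(target) / math.log(1.0 + alpha * at)

# Illustrative alpha values (assumed); the tiny alpha stands in for the
# exponential limit, the huge one for the step-function limit.
alphas = [1e-6, 5.0, 1_000_000.0]
times = [0.0, 0.5, 1.0, 2.0, 5.0]

curves = {a: [discount(t, a, beta_for(a)) for t in times] for a in alphas}

# Pure exponential through the same point phi(1) = .3.
exp_beta = -math.log(0.3)
exp_curve = [math.exp(-exp_beta * t) for t in times]

for a in alphas:
    print(f"alpha={a:>11g}:", [round(v, 3) for v in curves[a]])
print("exponential      :", [round(v, 3) for v in exp_curve])
```

With a very small alpha the curve is numerically indistinguishable from the exponential, while with a very large alpha every positive delay receives a weight close to .3, which is the dichotomous pattern described in the text.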

As noted already, Equation (15) satisfies the empirical "matching
law," which integrates a large body of experimental findings
pertaining to animal time discounting (Chung and Herrnstein, 1967); the
special case, $(1 + \alpha t)^{-1}$, was proposed initially by Herrnstein
(1981) and further investigated by Mazur (1987); the general hyperbola
was defined by Harvey (1968) and given an axiomatic derivation by Prelec
(1989) along the lines presented here.

Figure 5.1. The hyperbolic discount function,
$\phi(t) = (1 + \alpha t)^{-\beta/\alpha}$, for three different levels of
$\alpha$. All $\beta$'s are adjusted so that curves cross at
$\phi(1) = .3$. The most steeply sloped curve represents conventional
exponential discounting.
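The common difference effect (10) and the linear compensation rule (11) can be checked directly for the generalized hyperbola. The sketch below assumes illustrative parameters (alpha = 2, beta = 1, s = 1). The slope k = 1 + alpha*s used in the code is derived here by substituting (15) into (11); it is not a formula quoted from the text.

```python
def discount(t, alpha, beta):
    """Generalized hyperbola of Equation (15)."""
    return (1.0 + alpha * t) ** (-beta / alpha)

ALPHA, BETA = 2.0, 1.0   # assumed illustrative parameters
s = 1.0                  # delay at which x now and y later are indifferent

# Indifference today, v(x) = v(y) * phi(s), fixes the ratio v(x)/v(y).
ratio = discount(s, ALPHA, BETA)

# Slope of the compensating delay implied by (15): (1 + a*s)(1 + a*t) = 1 + a*(k*t + s).
k = 1.0 + ALPHA * s

checks = []
for t in [0.5, 1.0, 3.0, 10.0]:
    early = ratio * discount(t, ALPHA, BETA)      # v(x)*phi(t) / v(y)
    late = discount(t + s, ALPHA, BETA)           # phi(t + s)
    restored = discount(k * t + s, ALPHA, BETA)   # phi(k*t + s)
    checks.append((t, early, late, restored))
    print(f"t={t:5.1f}  early={early:.4f}  late={late:.4f}  restored={restored:.4f}")
```

For every common postponement t the later, larger outcome is strictly preferred (early < late), and indifference returns only when the later outcome is pushed out to k*t + s, with k > 1.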


Value Function 


A distinguishing feature of the current model is the replacement of
the utility function with a value function with a reference point, as
shown in Figure 5.2. The value function is pieced together from two
independent segments, one for losses and one for gains, which connect
at the reference point. Such functions have previously been applied to
decision making under uncertainty (Kahneman and Tversky, 1979),
consumer choice (Thaler, 1980), negotiations (Bazerman, 1984), and
financial economics (Shefrin and Statman, 1984). The shape and
reference point assumption reflect basic psychophysical
considerations: extra attention to negative aspects of the
environment, decreasing sensitivity to increments in stimuli of
increasing magnitude, and cognitive limitations.

Figure 5.2. A value function satisfying the three conditions described
in the text. Horizontal axis: $ change relative to status quo.

It is assumed that the reference level represents the status quo 
(i.e., the current level of consumption), and that new consumption 
alternatives are evaluated without consideration of existing plans. In 
certain cases, however, the reference point may deviate from the 
status quo to reflect psychological considerations such as social com- 
parison (Duesenberry, 1949), or the effect of past consumption that 
sets a standard for the present (Ferson and Constantinides, 1988; 
Pollak, 1970). 

The function in Figure 5.2 is representative of a class of functions 
that is consistent with the behavioral evidence presented earlier in 
the second section. The first and most elementary assumption built 
into the figure is loss aversion (Tversky and Kahneman, 1990): 


(V1) The value function for losses is steeper than the value function for
gains:


$$v(x) < -v(-x), \quad \text{for } x > 0,$$


which means that the loss in value associated with a given monetary
loss exceeds the gain in value produced by a monetary gain of the 
same absolute size. In this respect, our value function resembles the 
prospect theory value function (Kahneman and Tversky, 1979), 
which also places greater weight on losses. 

In the context of intertemporal choice, loss aversion specifically
penalizes intertemporal exchanges that are framed in compensating
variation terms, that is, as the incurring of a loss now in exchange
for a future gain, or enjoying a current gain in return for a future
loss. For instance, a person who is indifferent between receiving +q
now, or +x at some later date, would nevertheless not be willing to
pay q now in order to receive +x at the later date, because the value
of -q is greater in absolute magnitude than the value of +q.

The remaining two constraints on v(x) are geometrically more sub- 
tle, and have not been explicitly discussed in the context of prospect 
theory. Both constraints pertain to the elasticity of v(x),


$$\epsilon_v(x) \equiv \frac{\partial \log(v)}{\partial \log(x)} = \frac{x\,v'(x)}{v(x)}. \tag{16}$$


Our second assumption about the value function is behaviorally de- 
termined by the gain-loss asymmetry. 


(V2) The value function for losses is more elastic than the value function 
for gains: 


$$\epsilon_v(x) < \epsilon_v(-x), \quad \text{for } x > 0.$$


Suppose that +q is the equivalent present value of +x at time t, so
that $v(q) = \phi(t)v(x)$. The gain-loss asymmetry then implies that one
would prefer to pay -q now instead of -x at time t:
$v(-q) > \phi(t)v(-x)$. Equating $\phi(t)$ in both of these expressions
shows that:


$$\frac{v(q)}{v(x)} > \frac{v(-q)}{v(-x)}, \quad \text{for all } 0 < q < x. \tag{17}$$


Consequently, v(x) must "bend over" faster than v(-x), in the precise
sense captured by condition V2.*

*Let $u_1(x) = \ln[v(x)]$, and $u_2(x) = \ln[-v(-x)]$. Then (17) implies
that $u_1(x) - u_1(q) < u_2(x) - u_2(q)$, for all 0 < q < x, or:
$u_1'(x) < u_2'(x)$, for all x > 0, which is equivalent to Condition V2.

Our third and final assumption about v(x) is dictated by the magnitude
effect, in equivalent variation choices. If +q is the equivalent
present value for x at time t, $v(q) = \phi(t)v(x)$, then the magnitude
effect predicts that a proportional increase in both q and x, to
$\alpha q$ and $\alpha x$, will cause preference to tip in favor of the
later positive outcome, $v(\alpha q) < \phi(t)v(\alpha x)$. As in the
previous paragraph, by eliminating $\phi(t)$, we have,


$$\frac{v(q)}{v(x)} > \frac{v(\alpha q)}{v(\alpha x)}, \quad \text{for all } 0 < q < x,\ \alpha > 1. \tag{18}$$


The value function is subproportional, like the probability weighting 
function in prospect theory. As Kahneman and Tversky remarked
(1979, p. 282), such a function is convex in log-log coordinates, which 
for our model means that the derivative of log (v(x)) with respect to 
log (x) is increasing, or that: 


(V3) The value function is more elastic for outcomes that are larger in
absolute magnitude:


$$\epsilon_v(x) < \epsilon_v(y), \quad \text{for } 0 < x < y \text{ or } y < x < 0.$$


The implications of this condition can be visually assessed by
comparing Figures 5.2 and 5.3. Both figures show the same value
function, but plotted over a small (Figure 5.3) or a large range of
outcomes (Figure 5.2). For small outcomes, the function is sharply
convex, indicating that there is not much perceived value difference
between, say, a $1 gain and a $2 gain. This property accounts for the
high discount rates that apply to small outcomes (i.e., in a choice
between $1 now or $2 in a year). For large outcomes, however, the
function straightens out considerably (Figure 5.2), and, as a result,
generates much lower discount rates.

Figure 5.3. The same value function as in Figure 5.2, but plotted over a
smaller range of outcomes.
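Conditions V1 to V3 can be checked numerically for any candidate value function. The sketch below uses a hypothetical function chosen only because it happens to satisfy all three conditions: v(x) = sqrt(x) + x for gains and v(x) = -(1.25*sqrt(|x|) + 2.5*|x|) for losses. Neither this functional form nor the utility discount factor of .8 is claimed to be the function plotted in Figures 5.2 and 5.3.

```python
import math

def v(x):
    """Hypothetical value function (an assumption for illustration only)."""
    if x >= 0:
        return math.sqrt(x) + x
    m = -x
    return -(1.25 * math.sqrt(m) + 2.5 * m)

def elasticity(x, rel_step=1e-6):
    """Numerical version of (16): x * v'(x) / v(x), by central difference."""
    h = rel_step * abs(x)
    slope = (v(x + h) - v(x - h)) / (2.0 * h)
    return x * slope / v(x)

def present_equivalent(x, phi):
    """Solve v(q) = phi * v(x) for q by bisection (x > 0, v increasing)."""
    target = phi * v(x)
    lo, hi = 0.0, x
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if v(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

grid = [0.5, 1.0, 2.0, 10.0, 100.0, 1000.0, 10000.0]
gain_elasticity = [elasticity(x) for x in grid]
loss_elasticity = [elasticity(-x) for x in grid]

v1_holds = all(v(x) < -v(-x) for x in grid)                                # loss aversion
v2_holds = all(g < l for g, l in zip(gain_elasticity, loss_elasticity))    # losses more elastic
v3_holds = (all(a < b for a, b in zip(gain_elasticity, gain_elasticity[1:]))
            and all(a < b for a, b in zip(loss_elasticity, loss_elasticity[1:])))

# Magnitude effect: same utility discount factor, different money discount factors.
phi = 0.8
factor_small = present_equivalent(2.0, phi) / 2.0
factor_large = present_equivalent(10000.0, phi) / 10000.0
print(v1_holds, v2_holds, v3_holds, round(factor_small, 3), round(factor_large, 3))
```

Under these assumptions the money discount factor q/x is lower for the $2 outcome than for the $10,000 outcome, so the small amount appears to be discounted at a higher rate even though the utility discount factor is identical.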

Most probably, the elasticity of the value function does not in- 
crease indefinitely, but rather attains a maximum at some large dollar 
amount and then begins to decline. When one is comparing large 
and unexpected windfalls, it may be reasonable to prefer a million 
today to several million a few years hence—if drawing on the money 
in advance was completely prevented. The implausibility of this last 
requirement makes the interpretation of stated preference over large 
amounts problematic. 


Further Implications of the Model 


Aversion to Intertemporal Tradeoffs 


It follows from our model that a single individual will reveal not one 
but several discount factors for future cash outcomes, depending on 
how the choice is formulated. These discount factors can be geometri- 
cally derived, as in Figure 5.4, where we have overlaid the positive 
and negative branches of the value function, so that positive and 
negative outcomes can be represented along the positive x axis.
Starting with a delayed outcome of absolute magnitude x, and a time
interval yielding a discount factor for utility of .8, we can generate four 
distinct "present values” for x, depending on whether x is positive or 
negative, and whether the elicitation method is equivalent or com- 
pensating variation. Each present value, divided by x, then yields a 
specific discount factor.

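From equivalent variation, $v(q) = \phi(t)v(x)$, we get the discount
factors for gains (G) and losses (L),


$$\delta_{G,L} \equiv \frac{q}{x} = \frac{v^{-1}(\phi(t)v(x))}{x}, \quad (\delta_G \text{ for } x > 0,\ \delta_L \text{ for } x < 0), \tag{19}$$


while from compensating variation, $v(p) + \phi(t)v(x) = 0$, we have
the borrowing (B) and saving (S) factors,


$$\delta_{B,S} \equiv -\frac{p}{x} = -\frac{v^{-1}(-\phi(t)v(x))}{x}, \quad (\delta_S \text{ for } x > 0,\ \delta_B \text{ for } x < 0). \tag{20}$$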
Figure 5.4. Relationship among discount factors for compensating and
equivalent variation.


It is apparent from the geometry of the gain and loss value functions
in Figure 5.4 that these discount factors are ordered as:
$\delta_S < \delta_G < \delta_L < \delta_B$.

A notable aspect of the ranking is the large gap between the sav- 
ings and borrowing discount factors: A person whose choices are 
consistent with the value functions in Figure 5.2 would require a 
much more favorable rate in order to borrow than he or she would 
to save. The gap between $\delta_B$ and $\delta_S$ is a measure of how averse a
person is to borrowing and savings commitments generally, because 
it implies a range of risk-free interest rates at which a person will be 
unwilling to either save or borrow. 
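The four discount factors can be computed for a concrete case. The sketch below reuses a hypothetical value function (sqrt(x) + x for gains, -(1.25*sqrt(|x|) + 2.5*|x|) for losses), a utility discount factor of .8 as in the text, and an outcome of magnitude 100. The value function, the bisection helper `v_inverse`, and the amount are assumptions made for illustration. The code reads the equivalent and compensating variation formulas as factor = present value / x, with the saving factor taken at x > 0 and the borrowing factor at x < 0.

```python
import math

def v(x):
    """Hypothetical value function (an assumption for illustration only)."""
    if x >= 0:
        return math.sqrt(x) + x
    m = -x
    return -(1.25 * math.sqrt(m) + 2.5 * m)

def v_inverse(target, lo=-1e6, hi=1e6, iters=200):
    """Solve v(q) = target by bisection; v is strictly increasing."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if v(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def equivalent_factor(x, phi):
    """Equivalent variation: v(q) = phi * v(x), factor = q / x."""
    return v_inverse(phi * v(x)) / x

def compensating_factor(x, phi):
    """Compensating variation: v(p) + phi * v(x) = 0, factor = -p / x."""
    return -v_inverse(-phi * v(x)) / x

phi, x = 0.8, 100.0
delta_G = equivalent_factor(x, phi)      # delayed gain
delta_L = equivalent_factor(-x, phi)     # delayed loss
delta_S = compensating_factor(x, phi)    # pay now, gain later (saving)
delta_B = compensating_factor(-x, phi)   # receive now, pay later (borrowing)

print(f"S={delta_S:.3f}  G={delta_G:.3f}  L={delta_L:.3f}  B={delta_B:.3f}")
```

With these assumed inputs the saving factor is far below the borrowing factor, and the borrowing factor exceeds one, so this hypothetical person would decline a zero-interest loan.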

The existence of this gap was confirmed by Horowitz (1988), who 
elicited present and future values for real money payoffs, through 
a "first-rejected price" auction. According to Horowitz, "The most
striking feature of [the] experiment is individuals' apparent aversion
to both borrowing and lending." A substantial fraction of subjects
revealed discount factors greater than one for borrowing (i.e., they
refused zero-interest loans); this, too, is consistent with the model,
as we can see from the fact that $\delta_B > 1$ in Figure 5.4.


Framing Effects 


As in prospect theory (Kahneman and Tversky, 1979), we assume 
that the reference level is sensitive to the wording of the questions 
that elicit the intertemporal tradeoffs. For instance, direct choices 
between two losses, or two gains, are presumed to be likewise en- 
coded (or “framed”) as a pair of positive or negative values. The same 
would be true of requests for present amounts that create subjective 
indifference with respect to some future amount of the same sign. In 
such a context, we would interpret the elicited present value, q, for
amount x at time t, according to the equivalent variation formula:
$v(q) = \phi(t)v(x)$.

Questions involving delay or speedup of consumption are a clear
case where the compensating variation formula is appropriate. A
request, for example, for the maximum value that one would be willing
to sacrifice in order to speed up some positive amount (x) from time
t to the present suggests that the baseline levels are zero now, and
+x at the future time. In this frame, the speed-up constitutes a loss
of x at time t, and a gain of x minus the speedup cost at time zero.
The latter value, p, would then be interpreted according to the
compensating variation formula, $v(p) + \phi(t)v(x) = 0$, with x < 0 and
p > 0. The same frame covers delay-of-loss judgments, because in that
case, there is again a positive present benefit (avoiding the immediate
loss), and a future cost (absorbing the loss at the later date). The two
complementary question formats (delaying a gain, and speeding up a
loss) would yield present values also consistent with Equation (20)
but for a reversal in the sign of p and x, because there is a negative
adjustment to current consumption (p < 0), and a positive adjustment
to future consumption (x > 0).

Figure 5.5 compares these predictions with those of the normative
model, in which the distinction between a speed-up or delay is not
recognized. As indicated in the top half of the figure, the discount
rates estimated from expediting and delaying gains should be equal,
and higher than the devaluation rates estimated from expediting and
delaying losses. In contrast, the reference point model predicts that
common rates will be observed for the diagonal pairs in the matrix,
with the delaying gains/speeding-up losses pair producing a higher
estimate.

Figure 5.5. Discount rates when expediting and delaying gains and
losses: comparison of DU and reference point model predictions.

Clear support for the reference point model can be found in the 
data reported by Benzion, Rapoport, and Yagil (1989). Figure 5.6
displays implicit discount rates calculated from their data for each of the
four elicitation methods. As predicted, discount rates are high and 
virtually identical for expediting a loss (white diamonds) and de- 
laying a gain (black squares), and lower and again virtually identical 
for expediting a gain (black triangles) and delaying a loss (white 
squares). 

Our second framing example is produced by the discrepancy be- 
tween discounting of gains and losses. In this study, 85 students in 
an MBA class on decision making were randomly divided into two 
groups that each answered one of the following two questions. 
Figure 5.6. Implicit discount rates from Benzion, Rapoport, and Yagil
(1989); the rates have been averaged across the four dollar amounts used in
their study.


Version 1 


Suppose you bought a TV on a special installment plan. The plan 
calls for two payments; one this week and one in 6 months. You have 
two options for paying (circle the one that you would choose): 


A. An initial payment of $160 and a later payment of $110. 
B. An initial payment of $115 and a later payment of $160. 


Version 2 


Suppose you bought a TV on a special installment plan. The plan 
calls for two payments of $200; one this week and one in 6 months. 
Happily, however, the company has announced a sale that applies
retroactively to your purchase. You have two options (circle the one 
that you would choose): 


C. A rebate of $40 on the initial payment and a rebate of $90 on the
later payment. 

D. A rebate of $85 on the initial payment and a rebate of $40 on the 
later payment. 


Because options A and C and options B and D are the same in 
terms of payoffs and delivery times, DU predicts that there will be 
no systematic difference in responses to the two versions. Neverthe- 
less, a higher fraction of subjects opted for the lower-discount option 
(the one involving greater earlier payments) when the question was 
framed as a loss rather than as a gain. Fifty-four percent of subjects 
exposed to version 1 stated a preference for A over B. However, a 
significantly different fraction (33 percent) preferred C over D
($\chi^2(1) = 3.9$, p < .05). The proposed model explains the observed pattern
of responses as follows: In the first frame, the large, negative out- 
comes suffer less discounting, which causes people to decide on the 
basis of total payments. In the second frame, however, the outcomes 
are smaller in absolute magnitude and positive; both of these factors 
contribute to relatively high discounting of the delayed outcomes, 
leading to a preference for the second option, which offers a greater 
initial rebate. 
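The claim that the two versions describe identical payment streams is simple arithmetic, which the following sketch makes explicit. The dollar amounts are those in the two questions; the break-even discount factor at the end is an added illustration, not a figure reported in the study.

```python
BASE_PAYMENT = 200  # each scheduled payment in Version 2, in dollars

# (payment now, payment in 6 months) as stated in Version 1
version_1 = {"A": (160, 110), "B": (115, 160)}

# (rebate now, rebate in 6 months) as stated in Version 2
rebates_2 = {"C": (40, 90), "D": (85, 40)}

# Net payments implied by Version 2
version_2_net = {
    name: (BASE_PAYMENT - now, BASE_PAYMENT - later)
    for name, (now, later) in rebates_2.items()
}

same_A_C = version_2_net["C"] == version_1["A"]
same_B_D = version_2_net["D"] == version_1["B"]

# Option A costs more now and less later. A discounted-utility decision maker
# with linear utility prefers A when the 6-month discount factor exceeds this.
extra_now = version_1["A"][0] - version_1["B"][0]
saved_later = version_1["B"][1] - version_1["A"][1]
break_even_factor = extra_now / saved_later

print(same_A_C, same_B_D, break_even_factor)
```

Because the net streams coincide, any single discount factor implies the same choice in both versions; the observed difference in choices must therefore come from the framing.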

The choice of appropriate frame is not always unambiguous. A 
savings decision, for example, can be viewed as a simple choice be- 
tween benefits enjoyed now or later [Equation (19)], or a postpone- 
ment of present consumption for the future [Equation (20)]. Such 
changes in frame will, according to our theory, affect the range of 
interest rates that a person considers acceptable. 


Effect of Prior Expectations on Choice 


Consider two people waiting for an object (e.g., a computer); one 
has been told to expect delivery in 2 weeks, the other anticipates 
delivery in 4 weeks. When 2 weeks pass, both are faced with a new 
choice: the original computer to be delivered immediately, or a supe- 
rior computer to be delivered in 2 weeks. Who is more likely to wait? 
If both parties adapt their reference points to anticipated delivery 
times, then the reference point model predicts that the person who 
anticipated delivery in 2 weeks will be more impatient. This person 
frames the choice as the status quo versus a loss of a computer
immediately and a gain of a slightly superior computer in 2 weeks. Loss
aversion and discounting both militate against choice of the delayed,
superior, model. On the other hand, the person who anticipated later
delivery would frame the choice as a loss of the later computer and
gain of an earlier computer; here, loss aversion discourages the choice
of the earlier computer, while discounting has an opposing influence.
Thus, we predict that the person anticipating 2-week delivery would 
be more likely to accept. In effect, people who are psychologically 
prepared for delay are more willing to wait. 

This prediction was tested in a laboratory experiment conducted 
with 105 suburban Chicago tenth graders (Loewenstein, 1988b). All 
prizes were in the form of nontransferable gift certificates. As a result 
of an earlier experiment, half the students expected to obtain a $7 
gift certificate at an earlier date, half at a later date. When the earlier 
date arrived, all subjects were given a new choice between getting 
the $7 certificate immediately or a larger valued certificate at a later 
date. As predicted, prior expectations had a significant impact on 
choice. Twenty-seven out of 47 subjects who anticipated getting the 
prize at the earlier date opted for the immediate $7; only 17 of the 57 
who expected late delivery chose not to wait for the larger prize, a 
statistically significant difference. 
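The reported counts can be arranged in a 2 x 2 table and checked against a conventional significance threshold. The text does not say which test was used, so the Pearson chi-square statistic below (without continuity correction) is an assumption made for illustration, and the function name is hypothetical.

```python
def pearson_chi_square(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Rows: expected early delivery, expected late delivery.
# Columns: chose the immediate $7, chose to wait.
early_immediate, early_total = 27, 47
late_immediate, late_total = 17, 57

chi2 = pearson_chi_square(
    early_immediate, early_total - early_immediate,
    late_immediate, late_total - late_immediate,
)

CRITICAL_05_DF1 = 3.841  # chi-square critical value, 1 degree of freedom, 5 percent level

share_early = early_immediate / early_total
share_late = late_immediate / late_total
print(round(share_early, 3), round(share_late, 3), round(chi2, 2))
```

About 57 percent of the early-expectation group took the immediate prize against about 30 percent of the late-expectation group, and the statistic is well above the 5 percent critical value.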


High Discount Rates Estimated from Purchases 
of Consumer Durables 


Several studies have estimated discount rates from purchases of
consumer durables (e.g., air conditioners) (Gately, 1980; Hausman,
1979). Such purchases typically involve an up-front charge (the
purchase price) and a series of delayed charges (e.g., electricity
charges). Because more expensive models are generally more energy
efficient, it is possible to calculate the discount rate (or range of
discount rates) implicit in a particular purchase. A second source of
behavioral estimates of discount rates has been the studies of major
economic decisions such as saving (Landsberger, 1971) and
intertemporal labor-leisure substitution (Hotz, Kydland and Sedlacek,
1988; Moore and Viscusi, 1988).
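The calculation of an implicit discount rate from a durable purchase can be sketched as a break-even problem: find the rate at which the present value of the energy savings equals the price premium of the efficient model. The price premium, annual saving, and service life below are hypothetical numbers chosen for illustration; they are not drawn from Hausman (1979) or Gately (1980), and the function names are likewise assumptions.

```python
def present_value_of_savings(annual_saving, years, rate):
    """Present value of a constant annual saving received at the end of each year."""
    return sum(annual_saving / (1.0 + rate) ** t for t in range(1, years + 1))

def implicit_discount_rate(price_premium, annual_saving, years,
                           lo=0.0, hi=10.0, iters=100):
    """Bisection for the rate at which the savings are worth exactly the premium."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if present_value_of_savings(annual_saving, years, mid) > price_premium:
            lo = mid      # savings still worth more than the premium: rate can rise
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical example: the efficient model costs $100 more and saves
# $30 per year in electricity over a 10-year life.
rate = implicit_discount_rate(price_premium=100.0, annual_saving=30.0, years=10)
print(f"break-even discount rate: {rate:.1%}")
```

A buyer who passes up the efficient model in this example reveals a discount rate above the break-even value, while a buyer who chooses it reveals a rate below it; this is why a single purchase identifies only a range of rates.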

The estimates from these two classes of studies have differed 
sharply. Studies of consumer durable purchases show very high aver- 
age discount rates (across different income groups), for example, 
from 25 percent (Hausman, 1979) to 45-300 percent (Gately, 1980). 
Research on savings behavior or labor supply has almost uniformly 
found much lower discount rates (typically well below 25 percent).
How can these estimates be reconciled? The proposed model predicts 
that the small delayed electricity charges associated with the con- 
sumer durables will be substantially devalued because of the depen- 
dence of discounting on outcome magnitude. Thus, consumer dura- 
ble purchases will be insensitive to electricity charges, and discount 
rates estimated from those purchases will appear to be high. Discount 
rates estimated from major economic decisions would not be subject 
to such small-magnitude effects. 


Nonmonotonic Optimal Benefit Plans 


Our model makes certain predictions about the shape of the optimal 
intertemporal allocation of benefits under a constant market present 
value constraint. If one assumes that consumption at a point in time, 
x(t), is framed as a positive quantity, the value of the plan, covering 
the period from 0 to T, is given by the continuous version of the 
discounted value formula,


$$\int_0^T \phi(t)\,v(x(t))\,dt. \tag{21}$$


The optimal plan x(t), given a market interest rate r and a present 
value constraint,


$$\int_0^T e^{-rt}x(t)\,dt = I, \tag{22}$$


can be calculated by standard techniques (Yaari, 1964). Yaari showed
that if the optimal plan exists, and if the value function is concave 
and continuously differentiable, then the rate of change in consump- 
tion, for the optimal plan, equals [Equation (21)]: 


$$\frac{dx}{dt} = -\frac{v'(x)}{v''(x)}\left(r - \left(-\frac{\phi'(t)}{\phi(t)}\right)\right). \tag{23}$$


As Yaari observed, the direction of local change in consumption rate
is controlled by the sign of the difference between the market interest
rate and the rate of time preference ($-\phi'/\phi$). In view of our
hyperbolic discounting assumption, this allows for only three
qualitatively distinct possibilities: (1) The rate of time preference
is always greater than the market rate, in which case consumption is
decreasing throughout the interval; (2) the rate of time preference is
always lower
than the market rate, in which case consumption increases over the 
interval; and (3) the rate of time preference starts off above the market 
rate, but eventually drops below it and remains so, in which case 
consumption will decline to a minimum value (when the two rates 
equalize) and then increase afterwards. 
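The three cases can be distinguished by comparing the market rate with the rate of time preference implied by the generalized hyperbola, which is beta / (1 + alpha*t) and therefore falls over time. The sketch below assumes a concave value function, so that consumption rises exactly when the market rate exceeds the rate of time preference; the function names and parameter values are illustrative assumptions.

```python
def time_preference_rate(t, alpha, beta):
    """-phi'(t)/phi(t) for phi(t) = (1 + alpha*t) ** (-beta/alpha)."""
    return beta / (1.0 + alpha * t)

def classify_plan(alpha, beta, r, horizon):
    """Return (shape of the optimal consumption path, turning time or None)."""
    start = time_preference_rate(0.0, alpha, beta)
    end = time_preference_rate(horizon, alpha, beta)
    if end >= r:
        return "decreasing", None      # impatience exceeds r throughout
    if start <= r:
        return "increasing", None      # r exceeds impatience throughout
    turning_time = (beta / r - 1.0) / alpha   # where beta / (1 + alpha*t) = r
    return "decreasing then increasing", turning_time

cases = {
    "low market rate": classify_plan(alpha=1.0, beta=0.5, r=0.01, horizon=20.0),
    "high market rate": classify_plan(alpha=1.0, beta=0.04, r=0.05, horizon=20.0),
    "intermediate": classify_plan(alpha=1.0, beta=0.5, r=0.05, horizon=20.0),
}
for name, result in cases.items():
    print(name, result)
```

In the intermediate case consumption reaches its minimum at the time where the two rates are equal, which is the U-shaped plan described as possibility (3).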

Relative to normative theory, our model suggests that people may 
tend to prefer plans that sacrifice the medium-range future for the 
sake of the short- and the long-term future. There is nothing clearly 
wrong with this, provided one can commit to an entire plan at the 
moment of decision; if, however, the optimal plan can be recalculated 
at later points in time, then the planned sacrifice in midrange con- 
sumption will not take effect (Strotz, 1955). As a result, a bias in favor 
of the long and short runs may in practice yield behavior that is only 
oriented to the short run. 

This discussion presupposes a concave value function, which— 
although not explicitly assumed in assumptions V1-V3 —is certainly 
true for the function in Figure 5.4. In the loss domain, however, our 
working assumption is that the value function is convex, at least 
initially, which means that the most attractive plan for intertemporal 
loss allocation consists of concentrating the loss at a single point in 
time. The (negative) value of the loss, if allowed to accumulate at the
market rate to time t, equals $\phi(t)v(Ie^{rt})$, which means that it
will pay to delay payment whenever
$\phi'(t)v(Ie^{rt}) + rIe^{rt}\phi(t)v'(Ie^{rt}) > 0$, or, after
rearranging, whenever:


$$r < \left(-\frac{\phi'(t)}{\phi(t)}\right)\frac{1}{\epsilon_v(Ie^{rt})}. \tag{24}$$


The product on the right is decreasing in t, because $-\phi'/\phi$
equals $\beta/(1 + \alpha t)$ by assumption D1, and $\epsilon_v(Ie^{rt})$
is increasing by assumption V3. Hence, there is a unique point in time
(possibly at one or the other endpoint of the interval) at which the
loss is absorbed with smallest perceived cost.
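The existence of a single best moment to absorb a loss can be illustrated on a grid. The sketch below assumes hyperbolic parameters alpha = 1 and beta = 0.5, a market rate of 5 percent, a loss of 100 in present-value terms, a 20-period horizon, and a hypothetical loss branch v(x) = -(1.25*sqrt(|x|) + 2.5*|x|), which is convex in the loss domain and has increasing elasticity. All of these are assumptions made for illustration.

```python
import math

ALPHA, BETA = 1.0, 0.5   # assumed hyperbolic discounting parameters
RATE = 0.05              # assumed market interest rate
LOSS = -100.0            # assumed loss, valued at time zero
HORIZON = 20.0           # assumed planning interval

def discount(t):
    """Generalized hyperbola of assumption D1."""
    return (1.0 + ALPHA * t) ** (-BETA / ALPHA)

def loss_value(x):
    """Hypothetical loss branch of the value function (x < 0)."""
    m = -x
    return -(1.25 * math.sqrt(m) + 2.5 * m)

def perceived_value(t):
    """Value today of absorbing the loss, grown at the market rate, at time t."""
    return discount(t) * loss_value(LOSS * math.exp(RATE * t))

times = [i * 0.01 for i in range(int(HORIZON * 100) + 1)]
values = [perceived_value(t) for t in times]
best_time = max(zip(values, times))[1]

# Count how often the path switches between rising and falling.
rising = [b > a for a, b in zip(values, values[1:])]
turning_points = sum(1 for a, b in zip(rising, rising[1:]) if a != b)

print(f"least costly time to absorb the loss: t = {best_time:.2f}")
```

Under these assumptions the perceived value rises while delay still pays, peaks once near t = 9.2, and falls afterwards, consistent with a single crossing of the condition in Equation (24).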


Other Predictions 


Our model has several implications for the behavior of key economic 
variables during business cycles. First, it predicts that psychological 
factors will amplify the tendency for businesses to cut back on invest- 
ment during periods of lower than anticipated profits. In high-profit 
periods, the investment project is viewed in terms of equivalent
variation, as a choice between two gains: Take the excess profit now, or
take greater profits from investment later. But in periods of low or
negative returns, an identical investment opportunity would be
viewed in terms of compensating variation, that is, as incurring a
current loss in exchange for a future gain, which, as shown in the
previous section, will induce a higher subjective discount rate. There 
may, of course, be good economic reasons for reducing investments 
during economic downturns; what the model suggests is that psycho- 
logical factors additionally and independently contribute to the re- 
duction. 

For consumers too, an economic downturn should cause an increase in
impatience and a consequent decrease in saving. Consumers are likely
to frame drops in disposable income, or negative departures from
expected gains, as losses, so that saving from income will be viewed
in terms of compensating variation: a further loss in the present for
a gain in the future.* Saving out of an expanding income or out of
bonus income is more likely to be viewed in terms of equivalent
variation, inducing lower discounting and greater saving. Consistent
with this prediction, there is evidence that the marginal propensity
to save income from bonuses is higher than that from normal income
(Ishikawa and Ueda, 1984).

*The low rates of savings and negative real rates of interest in the
1970s (Mishkin, 1981) may reflect the shortfall from expectations
induced by economic stagnation following the prolonged economic boom
of the 1960s. At a social level, the tax cuts of the early 1980s, which
entailed a transfer of income from the future to the present, can be
interpreted similarly.

Our model is also possibly relevant to the so-called disposition
effect in real estate (Case and Shiller, 1989) and financial markets
(Ferris, Haugen, and Makhija, 1988; Shefrin and Statman, 1985). This
effect refers to the fact that people tend to hold on to losing stocks
and to real estate that has dropped in value, which depresses trading
volume during market downturns. In such situations people have a
choice between taking an immediate loss (by selling) or holding on
to the asset with the potential of further loss or potential gain.
Because the value function is convex in the loss domain, further losses
are less than proportionately painful, while gains yield marginally
increasing returns. The incentives are thus stacked in favor of holding
on to the asset. The incentives are reversed on the gain side,
motivating people to quickly sell assets that have gained in value.

In general, the market level implications of the model depend
critically on the presence or absence of arbitrage opportunities that
exist in a particular economic domain. Arbitrage opportunities are
extensive in some markets, such as those for fixed rate financial assets
where leveraged short sales are possible. In other markets, for
example, labor markets, arbitrage opportunities are virtually
nonexistent. We would expect to see the effects of subjective time
discounting manifested more clearly in the latter markets, in the
specific case through labor contracts that offer large initial wage
increases.

In financial markets, the effects of scale and sign, produced by the 
curvature of the value function, will presumably be arbitraged away. 
If a particular market were to offer high interest rates on small invest- 
ments, reflecting the magnitude effect, investors would simply bor- 
row large sums and then invest them in small packages, driving 
down the rate on small investments. 

Hyperbolic discounting is less easily arbitraged, even in financial 
markets. If most people demanded lower rates of return for long 
investment periods than for short ones, the yield curve could be 
downward sloping with no opportunities for arbitrage. Those who 
discounted the future at a constant rate would tend to invest in 
short-term securities, and might even short the long-term securities, 
but they could not do so without risk. Without denying that many 
purely economic factors influence the yield curve, our model suggests 
that psychological biases will independently exert pressure toward 
downward sloping.*

*Our analysis may help to explain Fama's (1984) finding that, contrary
to the liquidity preference hypothesis, the yield curve tends to drop,
on average, past a certain point.


Concluding Remarks


The discounted utility model has played a dominant role in economic
analyses of intertemporal choice. Although economists have
experimented with alternative formulations, these efforts have
typically responded to a single limitation of DU (e.g., increasing
consumption postretirement) rather than to a more comprehensive
critique. DU's basic assumptions and implications have, for the most
part, not been questioned. This chapter presents an integrated critique
of DU, enumerating a series of intertemporal choice anomalies that run
counter to the predictions of DU.

Perhaps most important, sensitivity to time delay is not well
expressed by compound discounting. A given absolute delay looms
larger if it occurs earlier rather than later; people are relatively
insensitive to changes in timing for consumption objects that are already
substantially delayed. Second, the marginal utility of consumption at 
different points in time depends not on absolute levels of consump- 
tion, but on consumption relative to some standard or point of refer- 
ence. Generally, the status quo serves as reference point; people con- 
serve on cognitive effort by evaluating new consumption alternatives 
in isolation, rather than integrating them with existing plans. 

Our model by no means incorporates all important psychological 
factors that influence intertemporal choice. For example, like any 
model with nonconstant discounting, it yields time inconsistent be- 
havior or "myopia," as Strotz (1955) called it. However, it cannot 
explain the high levels of conflict that such myopic behavior often 
evokes. Intertemporal choice often seems to involve an internal strug- 
gle for self-command (Schelling, 1984). At the very moment of suc- 
cumbing to the impulse to consume, individuals often recognize at a 
cognitive level that they are making a decision that is contrary to 
their long-term self-interest. Mathematical models of choice do not 
shed much light on such patterns of cognition and behavior (but see 
Ainslie, 1985). 

Such episodes of internal conflict are not entirely random. Certain 
types of situations, such as when a person comes into direct sensory 
contact with a choice object, seem to elicit especially high rates of 
time discounting while others do not. People exhibit high rates of 
discounting when driven by appetites such as hunger, thirst, or sex- 
ual desire. While not incompatible with the present model, these 
phenomena are not predicted by it. 

Finally, our model does not incorporate preference interactions 
between periods, despite the fact that our own recent empirical re- 
search has shown such interactions to be pervasive when people 
choose between sequences of outcomes. Preference interactions are 
revealed through a strong dislike of deteriorating outcome sequences, 
and through a liking for evenly spreading consumption over time 
(Loewenstein and Prelec, 1991). A taste for steady improvement 
seems to capture the preferences of most subjects, when sequences 
are being considered. Generally, the present model is more applicable 
to short-range decisions involving simple outcomes rather than long- 
term planning of consumption. No simple theory, however, can hope 
to reflect all motives that influence a particular decision. We have 
attempted to demonstrate that a theory with only two scaling func- 
tions can explain much of the observed deviation in preference from 
the normative discounted utility model. 

Using Classifier Systems to Study Adaptive Nonlinear Networks





Many systems of high interest to humankind (economies, political organizations,
games, ecologies, the central nervous system, developing organisms, biological
evolution, etc.) rarely, if ever, “settle down” to some repetitive or other easily
described pattern. Such systems are

• intrinsically dynamic (When they settle down they are “dead” or uninteresting.)

• far from a global optimum (There is always room for further improvement,
  though the system may perform quite well in a comparative sense.)

• continually adapting to new circumstances (The strategies or structures that
  determine the system's interactions continually change, often with accompany-
  ing improvements in performance.)


Even the evolution of strategies in such simply defined “universes” as chess and 
Go illustrates the point. Play of these games has steadily improved over time, no one 
believes that current strategies are anywhere near optimal, and the games would be 
much less interesting if somehow we did know the optimal strategy. Systems with 
these characteristics pose substantial problems for those who would study them 
formally because classical tools based on linearity, fixed points, attractors, etc., at 
best provide an entering wedge.

A brief inspection shows that all the systems mentioned involve a large num-
ber of “agents” adapting to each other in a complex network of local, nonlinear
interactions. It is convenient to label these systems adaptive nonlinear networks
(ANN’s hereafter), though we are far from having an overall theory to justify the
label. In many respects, the study of ANN's turns on the answer to a single ques- 
tion: How does an ANN adapt to a perpetually novel environment that continually 
offers opportunities for further improvement? Three interacting subsystems must 
be defined in order to pursue an answer to this question: (1) the environment in 
which the system acts, (2) the structures that generate the system’s actions, and 
(3) the mechanisms that progressively adapt the system's structures to the envi- 
ronment. These lectures outline a set of computationally defined versions of these 
subsystems, accompanying them with examples and some relevant theorems. 





[FIGURE 1 Overview. Labeled components: input interface (detectors); bucket
brigade (adjusts rule strengths); genetic algorithm (generates new rules).]
THE PERFORMANCE SYSTEM
[See Figure 1] 

The term performance system will be used to name the system-environment 
complex (subsystems (1) and (2) above) that exists at any instant. The performance 
system specifies the ANN’s capabilities at a given point in time—its abilities in the 
absence of further adaptation. The adaptive mechanisms modify the performance 
system on the basis of its experience. Because adaptation is a pervasive process, 
always underway, the performance system is a continually changing object. It serves 
as grist for the mill provided by the adaptive mechanisms. 

A description of the performance system requires, first of all, specification of 
the system's ways of interacting with its environment. We adopt the common view 
that the state of the environment is conveyed to the performance system via a set
of detectors (e.g., rods and cones in a retina) and that the system acts upon its
environment via a set of effectors (e.g., muscles). We also adopt the view that the
outputs of the detectors can be treated as standardized packets of information— 
messages. In these terms, the performance system becomes a system for processing 
messages from the environment in order to determine ongoing effector settings. 

There is another interaction with the environment that plays a central role 
when we come to discuss the inductive mechanisms. The environment, under some 
circumstances, provides the system with payoff (reward, reinforcement). It is much 
as if the system were repeatedly playing games such as checkers, chess or poker,
in which some situations amount to overt “wins” or “losses,” and payments are
made accordingly. The rate at which the system acquires payoff is a measure of its
performance, and a vital element in any careful discussion of learning or adaptation.
It is important to note that in most realistic situations this payoff is intermittent or sparse.

The performance system can be looked upon as a kind of office wherein the
message list is a bulletin board holding the memoranda that must be handled that
day. Each rule corresponds to a “desk” that has responsibility for certain kinds of
memos. At the beginning of a day, each desk collects the memos for which it is
responsible, it processes them, and then, at the end of the day, it posts the memos 
that result from its processing. Thus, the memos on the board at the beginning of 
the day come either from the previous day's work (messages produced by the rules) 
or from outside the office (messages produced by the environment). Some memos 
cause actions outside the office (messages that control effectors). The parallelism of 
the system is evident. All rules simultaneously check the message list, and all rule
actions are taken simultaneously because they simply add messages to the message
list.
[FIGURE 2 Application of overview to gas-pipeline control.]


THE GAS-PIPELINE EXAMPLE (I)
[See Figure 2]

Goldberg's? system was designed to simulate the induction of the expert knowl- 
edge required to regulate gas-pipeline transmission. The system's objective is to 
meet demand at the end of a pipeline as economically as possible. This demand 
varies on an hourly basis and on a seasonal basis. Moreover, the pipeline may 
suffer transient leaks that upset the system's ability to deliver gas at appropri- 
ate pressures. The pipeline itself is a nonlinear system involving both storage and 
transmission, making it a fairly complex simulated environment. 
At the beginning of each (hourly) time-step, the input interface supplies a mes-
sage summarizing the readings of a set of gauges and detectors in the environment: 
inflow, outflow, inlet pressure, outlet pressure, pressure change rate, season, time 
of day, time of year, and current temperature reading. It is important to note that 
none of these readings directly detects a leak; the system must induce the con- 
ditions under which a leak is present. The major control variable available to the
system (the effector in its output interface) determines settings for pipeline inflow.

Payoff for the system is based upon the relation between the pressure delivered 
and the demand. The system also receives some payoff for successful leak detection 
and for appropriate action (for example, when the system acts to return the flow 
to an acceptable pressure when the pressure is out of acceptable range). 

There will be further discussion of this example at appropriate points along the
way.

[FIGURE 3 Messages and rules. The diagram shows a message list of k-bit strings
beside a rule list whose entries have the form
condition 1, condition 2 / message [strength]. Annotations:

  Generality/specificity: more #'s in a condition => wider range of messages
  accepted.
  All rules check all messages simultaneously for matches.
  Rules with matched conditions bid to post messages:
      bid = c (strength) (specificity),   specificity = [k - (no. of #'s)] / k
  Winners are chosen with a probability proportional to the size of their
  bids (there will generally be many winners).
  Winning rules post messages.]
MESSAGES AND RULES
[See Figure 3]

The computational basis gains greatly in uniformity if we require that messages 
mediate internal processing and supply information about the environment. This 
provides a particularly simple form for the condition/action rules that serve as the 
basic components. Each rule is a simple message processor: Conditions “look for" 
certain kinds of messages and, when the conditions are satisfied, the action specifies 
a message to be sent. Many rules can be active simultaneously, so many messages 
may be present at a given time. These messages determine both the internal (rule) 
and external (effector) activity of the performance system. The messages are col-
lected in a list that changes under the combined impetus of the environment and
the rules. Messages are ephemeral, lasting only a single time-step (“day”); a rule
must repeatedly post a message to keep it on the list for longer periods.

Because the conditions of rules refer only to messages, it is easy to determine 
the relative specificity of a condition. The most specific condition is one that will
accept one and only one message. The most general condition is one that accepts
any and all messages indiscriminately. In between are conditions that are satisfied
by some messages but not by others. One condition is more specific than another if
the subset of messages that satisfy the one is smaller than the subset that satisfies
the other.

Classifier systems are a particular class of message-passing, rule-based systems. 
In a classifier system using k-bit messages, a condition is specified as an element
of {1,0,#}^k, where # is a “don’t care” or “wildcard” that allows the condition to
accept any bit value at that position. For example, 1##...# is a condition that
is satisfied by any message starting with a 1.
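
The matching rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical (they do not come from the lectures), and messages and conditions are represented as equal-length strings over 0, 1 and #.

```python
def matches(condition, message):
    """Return True if every non-'#' position of the condition equals the message bit."""
    if len(condition) != len(message):
        raise ValueError("condition and message must have the same length k")
    return all(c == "#" or c == m for c, m in zip(condition, message))


def specificity(condition):
    """Fraction of the k positions that are specified (not '#')."""
    k = len(condition)
    return (k - condition.count("#")) / k


def accepted_count(condition):
    """Number of distinct k-bit messages that satisfy the condition."""
    return 2 ** condition.count("#")
```

With k = 4, the condition 1### accepts 8 of the 16 possible messages, the fully specified condition 1010 accepts exactly one, and #### accepts all of them, which is the ordering by specificity described above.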

It is a central tenet of our approach that all rules serve as hypotheses, more or
less confirmed, rather than as incontrovertible facts. The cognitive system's reliance 
upon a rule is based upon its average usefulness in the contexts in which it has been 
tried previously. (Determination of this usefulness is a matter of credit assignment 
which will be discussed further on). This rating is summarized in a quantity called 
the rule's strength. A rule's relative advantage in competition with other rules is 
based upon its strength. 

If we compare this process to the central nervous system, we see that individual 
rules play a role somewhat analogous to that of individual neurons. Messages are 
the counterparts of pulse trains moving over axons, conditions play the role of 
synapses in filtering the pulse train, and the action of the rule is the counterpart 
of the pulse train produced by the neuron on its outgoing axon. Realistic neurons, 
with time-varying thresholds, habituation and fatigue effects, integrative dendritic 
propagation, etc., have more sophisticated processing capacities than the rules we
use; on the other hand the rules we use, though they are simple, probably make
more detailed use of the information they process.
PARALLELISM AND “BUILDING BLOCKS”
[See Figure 3] 

Parallelism makes it possible for the performance system to combine rules 
into clusters that model the environment, thus avoiding the disadvantages of using 
monolithic, predetermined structures to handle the torrent of relevant and irrele- 
vant data impinging upon the system. By using different rules to describe different 
aspects of the current situation, we gain two important advantages: 


1. Combinatorics work for the system instead of against it. The advantage is 
similar to that obtained when one describes a face in terms of components 
instead of treating it as an indecomposable whole. If we select, say, 8 com- 
ponents for the face—hair, forehead, eyes, nose, mouth, chin, and the like— 
and allow 10 alternatives for each, then one hundred million faces can be de- 
scribed by combining components, at the cost of storing only 80 individual 
components. 

2. Experience can be transferred to novel situations. A given rule can be used
as a building block in many combinations, just as a single alternative for a
facial component, say, a particular nose shape, can be used with alternatives 
for each of the other components (ten million possibilities). If the rule proves 
useful in a fair sample of these contexts, it is at least plausible to believe it 
will prove useful in similar combinations not yet encountered. To exploit 
these possibilities, the rules must be organized in a way that permits facile 
reorganization of the system as it gains experience. See the topic “Default 
Hierarchies and Internal Models” below. 
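
The counts quoted in the two points above follow from simple arithmetic, spelled out here as a short check. The variable names are illustrative only.

```python
components = 8      # facial components (hair, forehead, eyes, ...)
alternatives = 10   # alternatives allowed for each component

# Storage cost: one entry per alternative per component.
stored_parts = components * alternatives

# Every choice of one alternative per component describes a distinct face.
describable_faces = alternatives ** components

# Fixing one alternative (say, one nose shape) leaves the other
# seven components free, so that part appears in this many faces.
contexts_per_part = alternatives ** (components - 1)
```

The three values are 80, 100,000,000 and 10,000,000, matching the figures in the text.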




COMPETITION
[See Figure 3]

All rules that have satisfied conditions at any given time enter a competition for
the right to post their messages, and only the winners of this competition actually 
post messages. Usually there will be several winners and several losers. 

We introduce competition for a variety of reasons. First of all, competition pro- 
vides a simple, situation-dependent means of resolving conflicts between rules. This 
is particularly important when sets of concurrently active rules are used to generate 
responses. Maintaining the consistency of all the combinations of rules that could 
be active simultaneously would be an overwhelming task, a task that would defeat
the advantages of parallelism. Moreover, the rules satisfied in a given situation of-
ten serve as alternative hypotheses about the situation. To the extent that they 
are genuine alternatives they are necessarily inconsistent. When such alternatives 
are presented, the cognitive system must decide which one(s) to favor, even though 
the particular combination of rules satisfied may not have been encountered before.
Under competition this resolution is experience-based. 

To implement the competition we assign to each rule a quantity called its 
strength. Later on we will see that a rule's strength is adjusted by a credit assign-
ment algorithm to reflect the rule's past usefulness to the system. The competition 
itself turns out to be an important vehicle for carrying out the credit assignment 
process. 

The actual competition is based on a bidding process. Each satisfied rule makes 
a bid based upon its strength and its specificity. It is useful to treat the bid as some 
proportion of the rule's strength, a greater proportion being allocated if the rule's 
conditions are more specific. (In a classifier system, the bid ratio is determined by 
the number of specified bits divided by the total number of bits, k.) A rule that has 
been useful to the system in the past (high strength) and uses more information 
about the current situation (high specificity) will make a higher bid, and will tend 
to be among the winners of the competition. Various criteria for winning can be 
employed. For example, the probability of winning can be based on the size of the
bid, or all rules making bids at least equal to the average bid could be declared 
winners. 
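
The bidding process can be sketched as follows. This is a minimal illustration, not the lectures' implementation: the function names are hypothetical, the bid constant c = 0.1 is an arbitrary assumption, and each rule is given a single condition for brevity. Both winning criteria mentioned above are shown.

```python
import random


def bid(strength, condition, c=0.1):
    """Bid = c * strength * (specified bits / k); c = 0.1 is an assumed constant."""
    k = len(condition)
    specified = k - condition.count("#")
    return c * strength * specified / k


def winners_above_average(bids):
    """Criterion A: every rule bidding at least the average bid wins."""
    average = sum(bids.values()) / len(bids)
    return [name for name, amount in bids.items() if amount >= average]


def sample_winner(bids, rng):
    """Criterion B: draw one winner with probability proportional to its bid."""
    names = list(bids)
    weights = [bids[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

For two rules of equal strength, the one with more specified bits makes the larger bid, so it is more likely to be among the winners under either criterion. Calling `sample_winner` repeatedly with a `random.Random` instance would yield several winners per time-step.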


THE GAS-PIPELINE EXAMPLE (II)


The classifier system that is coupled to the gas-pipeline environment uses a message
length of 16 bits, a message list that holds 5 messages, and a set of 60 classifiers. This
is a very small system for such a complex problem. That it can accomplish so much
bodes well for the theory underlying it. It is an interesting aside that the system
was tested and run on a microcomputer with a 61K memory! In a typical run there
were 24 time-steps per day and the system achieved near-optimal (expert-level)
performance in about 1,000 days of simulated experience (24,000 time-steps).


DEFAULT HIERARCHIES AND INTERNAL MODELS 
[See Figure 4] 
Because competition favors more specific classifiers, the performance system
tends to organize itself into default hierarchies. The simplest example of a rule-based
default hierarchy consists of two rules: The first (“default”) rule has a relatively un-
specific condition and provides an action that is sometimes correct and sometimes
incorrect. The second (“exception”) rule is satisfied only by a subset of the messages
satisfying the first rule and its action generally corrects errors committed by the first
rule. That is, the second rule uses additional information (its more specific condi-
tion) to distinguish messages that lead the first rule astray. Note that, when a mes-
sage satisfies both the first and second rule, the second rule is favored in the compe-
tition because it is more specific. As a result, there is a kind of symbiosis between the
two rules. The specific rule both provides the correct action, and saves the general
rule from a mistake when it prevents that rule from winning. The latter action saves
the general rule from losing strength under the credit assignment algorithm, as we
will see shortly. Consequently, not only is the system as a whole better off for the
presence of the second rule, but the first rule itself is better off.

[FIGURE 4 Emergent default hierarchy. The diagram contrasts a more specific
condition with a more general one (legible feature labels: “moving,” “near,”
“small”); the more specific condition tends to win competitions when both
conditions are satisfied. The emerging default hierarchy is “symbiotic”: the
exception rule prevents the default rule from making mistakes, thereby increasing
the default rule's net payoff rate, while itself increasing the overall payoff rate.]

The exception rule may in turn make errors that can be corrected by still more
specific exception rules, and so on. Whence comes a multilevel, possibly tangled, hi- 
erarchy. We will see soon that such default hierarchies are easily learned in parallel, 
competitive systems that give an advantage to more specific rules. Each such hier- 
archy acts as a skeleton from which the performance system constructs its internal 
models. The skeletons are fleshed out by associations and predictions provided by 
rules coupled to rules in the hierarchies. 


THE GAS-PIPELINE EXAMPLE (III)


The control system starts with a set of randomly generated classifiers. The bucket
brigade algorithm (see below) strengthens the best rules in this set. They are typi- 
cally overly general, incomplete, sometimes counterproductive rules. In the problem 
of leak detection, which is of most interest from the point of view of adaptation, 
the overly general rule retained in one case was 

IF [anything (a condition with all #'s)]
THEN [send “no leak” message]

Such rules supply a starting point for the genetic algorithm (see below) and often
serve (at least initially) as the top, most general level of the emerging default hier- 
archy. The genetic algorithm recombines parts of rules strengthened by the bucket 
brigade algorithm, eventually producing a rule like 

IF [input pressure low, output pressure low, rate of change of pressure very
negative],
THEN [send “leak” message]

Both rules persist in the system, the first indicating normal operation and serving
as a default, the second serving to indicate an exceptional condition. When the
conditions of both rules are satisfied, the second rule wins, if its strength is in the
same range as that of the first rule, because of the higher bid based on its greater
specificity.
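
The default/exception pair can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration: the 4-bit encoding (bit 1 = input pressure low, bit 2 = output pressure low, bit 3 = rate of change very negative, bit 4 unused), the equal strengths, the constant c = 0.1, and the simplification that the single largest bid wins. Under the plain specified-bits ratio an all-# rule bids zero, so in this sketch it acts only when no more specific rule matches.

```python
def matches(condition, message):
    """A condition matches when every non-'#' position equals the message bit."""
    return all(c == "#" or c == m for c, m in zip(condition, message))


def bid(strength, condition, c=0.1):
    """Bid = c * strength * (specified bits / k); c is an assumed constant."""
    k = len(condition)
    return c * strength * (k - condition.count("#")) / k


# Hypothetical 4-bit encoding, for illustration only.
RULES = [
    {"condition": "####", "action": "no leak", "strength": 100.0},  # default
    {"condition": "111#", "action": "leak", "strength": 100.0},     # exception
]


def respond(message, rules):
    """Return the action of the matching rule with the largest bid (a simplification)."""
    candidates = [r for r in rules if matches(r["condition"], message)]
    best = max(candidates, key=lambda r: bid(r["strength"], r["condition"]))
    return best["action"]
```

Here `respond("0000", RULES)` gives "no leak" because only the default matches, while `respond("1110", RULES)` gives "leak" because the exception outbids the default when both are satisfied.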


COUPLING AND TAGS 
[See Figure 5]

Coupling is the mechanism that makes rule sequencing and association possible. 
In broadest terms one rule R1 is coupled to another rule R2 if the message produced
by R1 tends to cause R2 to take action (cf. one neuron synapsing with another).
More carefully, rule R1 is coupled to rule R2 if the message specified by R1 satisfies
at least one of the conditions of R2. As a concrete example consider the following
two rules:


(R1) R1's condition is satisfied whenever there is a “small, fast-moving
     object” in the environment. R1's action part specifies a message having
     a part, say a suffix, that can be interpreted as an “alert” to the system
     (Figure 5 gives a related example).

(R2) R2's condition is satisfied by any message with an “alert” suffix. R2's
     action part specifies a message that initiates a process for “centering”
     objects in the visual field.


[FIGURE 5 Tags and coupling. Messages are assigned a tag region (say, a prefix);
for example, the tag 0000 marks a “message from input interface.” Classifiers are
coupled by tags: in the diagram, one classifier is coupled to two others via the
tag 1000. Condition and message labels shown: “from input interface,” “prey,”
“small,” “non-striped,” “round,” “dull-colored,” “on-the-ground,” and “execute
‘pursue’ sequence.”]
Informally, R1 has the form

IF “small, fast-moving object” THEN “alert,”

and R2 has the form

IF “alert” THEN “center.”

Because of the coupling, R1 will be instrumental in causing “small, fast-moving”
objects to be “centered.”

Note that rules other than R2 may have conditions satisfied by either the “alert”
suffix or by other parts of R1's message, so that R1 may be coupled to many rules.
Similarly, R2 is activated by any rule that sends a message with an “alert” suffix.
(In a classifier system, this means that the condition consists of “don’t cares” except
at the positions of the “alert” suffix). Thus, R2 can serve as a component in many
contexts, much as a facial feature in the earlier example serves in the description 
of many faces. 

Coupling, in conjunction with the simultaneous execution of rules, makes it 
possible for the system to respond to particular situations with coordinated clusters 
of rules. A “red Saab with a flat tire by the side of the highway” is handled by a 
cluster of coupled rules each responding to some aspect of the situation. Even if 
the performance system has never encountered the “red Saab...” situation before, 
it can put together a plausible response if it has rules relevant to the components 
of the situation. 

The sequential activations inherent in the coupling mechanism also can be 
used to provide associations between rules and the situations activating them. (The 
process is similar to Hebb's? linking of cell assemblies to form phase sequences, or 
the associations Fahlman! achieves by passing markers over a semantic net). Part 
of a rule’s emitted message can be used, as was the “alert” suffix in the previous 
example, to “direct” the message to rules having conditions responsive to that 
part. Part(s) of a message that are used to “address” that message to other rules 
are called tags. If a rule's condition is satisfied by any message bearing a particular 
tag (a particular suffix, prefix, or the like), then that rule effectively has an address. 
Messages can be sent to it by providing them with the appropriate tag. If there are 
several rules sensitive to a given tag then they will be activated as a cluster. 
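
Coupling through a tag can be sketched as below. The 6-bit messages, the 2-bit “alert” suffix 11, and all names are hypothetical choices for illustration; the only idea taken from the text is that one rule is coupled to another when its message satisfies at least one of the other rule's conditions.

```python
def matches(condition, message):
    """A condition matches when every non-'#' position equals the message bit."""
    return all(c == "#" or c == m for c, m in zip(condition, message))


def is_coupled(sender, receiver):
    """True if the sender's message satisfies at least one receiver condition."""
    return any(matches(c, sender["message"]) for c in receiver["conditions"])


ALERT = "11"  # hypothetical 2-bit suffix tag meaning "alert"

RULES = {
    # R1: fires on some detector pattern and posts a message carrying the tag.
    "R1": {"conditions": ["01####"], "message": "0000" + ALERT},
    # R2 and R3: "don't cares" everywhere except at the tag positions.
    "R2": {"conditions": ["####" + ALERT], "message": "100000"},
    "R3": {"conditions": ["####" + ALERT], "message": "010000"},
}


def cluster_for(sender_name, rules):
    """Names of all rules addressed by the sender's message."""
    sender = rules[sender_name]
    return [name for name, rule in rules.items()
            if name != sender_name and is_coupled(sender, rule)]
```

In this toy setup `cluster_for("R1", RULES)` returns R2 and R3, which are activated together because both are sensitive to the same tag, while R1 itself is not addressed by either of their messages.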

When rules in a cluster have more than one condition, the additional condi- 
tions can be used to make the associations sensitive to context. Using the example 
above, R2 might have a second condition specifying that the message list contain
no message with the suffix “imminent danger.” R2 would then take action only if
there were an “interesting object” and no “imminent danger.” As we will see, tags
are easily “invented” and modified by the inductive mechanisms used in this frame-
work, so that associations and sequences based upon the tags are readily modified
on the basis of experience.
ADAPTIVE MECHANISMS


We are now ready to begin discussing the ways in which experience, through learn- 
ing, modifies the performance system. The first problem for the cognitive system is 
to evaluate the rules it already has. This problem, the credit assignment problem, is
particularly important when the system has many rules and all of them are treated
as hypotheses subject to varying degrees of confirmation. Inevitably, under credit
assignment, the system uncovers rules that are of little value to it. It will even be 
the case, as the system gains experience, that once-valued rules decline greatly in 
value. It is natural to replace such rules with newer hypotheses that have a greater 
potential for being valuable in the context of the rules that are well valued. The 
problem here, the rule discovery problem, is to make the rule generation process 
dependent on experience, so that newly generated rules are at least plausible in
terms of that experience. We will discuss each of these problems in order. 


[FIGURE 6 The Bucket Brigade. Diagram: a chain of coupled classifiers C', C, C''
winning at times t-1, t, t+1. Annotations:

  At time t, classifier C is a supplier of classifier C'' if C posts a message
  that satisfies a condition of C''; C'' is a consumer of C if C'' then wins the
  competition.
  If C and C'' win in the competitions at t and t+1, then C is first a consumer
  (of C') then a supplier (of C'').
  When a classifier has won the competition it immediately
    1) posts its message for use on the next time-step;
    2) pays its bid to its supplier(s), thereby reducing its strength.
  A classifier has two sources of income: (1) bids from its consumers and
  (2) payoff from the environment.
  Coupled classifiers form the “bridges” for the bucket brigade.
  Each classifier is a “middleman” in a complex economy. It will be strong only
  if (1) it is consistently active at times of payoff, or (2) it belongs to
  coupled sequences leading to payoff.]
CREDIT ASSIGNMENT
[See Figure 6] 

Note that credit assignment is not particularly difficult where the situation 
provides immediate reward or precise information about correct actions. It becomes 
quite difficult when credit must be assigned to early-acting rules that set the stage 
for a sequence of actions leading to a favorable (rewarding) situation. Now the 
system must decide which rules active along the way actually contributed to the 
outcome. Parallelism adds to the difficulty. At any step along the way, only a few of 
the active rules may contribute to the favorable outcome, while others are ineffective 
or, even, obstructive. Somehow the credit assignment algorithm must sort out all 
of this, modifying rule strengths appropriately. 

We use competition as the vehicle for credit assignment. To do this, we treat 
each rule as a “middleman” in a complex economy. At any given time its suppliers
are those rules that have sent messages satisfying its conditions, and its consumers
are those rules that both have conditions satisfied by its message and have won their
competition in turn. Stated another way, the middleman is coupled to its suppliers
and its consumers are coupled to it. Under this regime, we treat the strength of a 
rule as capital and the bid as an actual payment to its suppliers. That is, when a 
rule wins, its bid is actually apportioned to its suppliers, increasing their strengths 
by the amounts apportioned to them. At the same time, because the bid is treated 
as a payment for the right to post a message, the strength of the winning rule is 
reduced by the amount of its bid. Should the rule bid but not win, its strength 
is unchanged and its suppliers receive no payment. We call this credit assignment 
procedure a bucket brigade algorithm. 

Winning rules can recoup their payments in two ways: (1) They in turn have 
winning consumers that make payments to them, or (2) they are active at a time
when the system receives payoff from the environment. 

In case (2) we reach the point, alluded to in the initial discussion of interactions 
with the environment, where payoff affects the performance of the system. When 
the system acquires payoff from the environment, it is divided among the active 
rules, their strengths being increased accordingly. Only rules directly active at the 
time of payoff share in that payoff; the system must rely on the credit assignment 
algorithm to distribute its effects to other rules. 
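
The bookkeeping described above can be traced on a three-rule chain. This is a sketch under explicit assumptions, not the lectures' algorithm in full: the helper names are hypothetical, bids are fixed at 10 rather than computed from strength and specificity, a bid is split equally among suppliers, and payoff is split equally among the rules active when it arrives.

```python
def pay_bid(strengths, winner, suppliers, amount):
    """The winner pays its bid; the amount is split equally among its suppliers.

    With no rule suppliers (the triggering message came from the environment)
    the winner still pays, but no rule is credited.
    """
    strengths[winner] -= amount
    for supplier in suppliers:
        strengths[supplier] += amount / len(suppliers)


def share_payoff(strengths, active, payoff):
    """Environmental payoff is divided equally among the currently active rules."""
    for rule in active:
        strengths[rule] += payoff / len(active)


# Chain A -> B -> C: A is triggered by the environment, C's action earns payoff.
strengths = {"A": 100.0, "B": 100.0, "C": 100.0}

pay_bid(strengths, "A", [], 10.0)       # step t:   A wins
pay_bid(strengths, "B", ["A"], 10.0)    # step t+1: B wins and pays A
pay_bid(strengths, "C", ["B"], 10.0)    # step t+2: C wins and pays B
share_payoff(strengths, ["C"], 60.0)    # payoff arrives while C is active
```

After one pass A and B have broken even at 100 and C has risen to 150. On later passes C's larger strength would support larger bids, which is how the benefit propagates back toward the early stage-setting rules.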

In broad outline, the “middleman” approach works because rules will become
strong only if they are coupled into sequences leading to payoff. To see this, note
first that rules consistently active at times of payoff tend to become strong because
of the payoff they receive. As these rules grow stronger, they make larger bids.
The strength of a rule coupled to one of the “payoff” rules (a supplier of the
“payoff” rule) then benefits from these larger bids. Subsequently, the suppliers of
the suppliers begin to benefit, and so on, back to the early stage-setting rules. 

A supplier might, of course, convert the environmental state to one that diverts 
its consumer from a payoff-directed path—it might fail in its stage-setting role. In 
that case the consumer will suffer because the diversion will prevent it from receiving 
payments from its consumers. However, the diverting supplier generally suffers even
more because it is at an earlier stage in its “getting rich” effort. As a result, the 
supplier producing the diversion soon loses enough strength to make it no longer a 
factor in the competition. 

The whole process, of course, takes repeated “plays of the game.” But it only
requires that a rule interact with its immediate suppliers and consumers. It requires 
no overt memory of the long and complicated sequences leading up to payoff. Avoid- 
ing such overt memories is almost a sine qua non for large, parallel systems acting 
in complex environments with sparse payoff. Overt memories would necessarily 
involve many tangled strands including unnecessary detours and incidentals. To 
tease out the relevant strands in timely fashion when payoff occurs would be an 
overwhelming, hardly feasible task. 

The coupled sequences involved in the credit assignment algorithm fit naturally 
into the default hierarchies discussed earlier. It is also possible to increase the 
subtlety of the credit assignment process, and the level of analysis, by incorporating 
other mechanisms from economics, such as taxes, subsidies, discountings, etc., but 
these are matters of ongoing research.





THE GAS-PIPELINE EXAMPLE (IV) 


Note that the “profitability,” and eventual strength, of the rule 
IF [input pressure low, output pressure low, rate of change of pressure very 
negative],
THEN [send “leak” message] 
depends upon its being coupled via its message to rules that take appropriate actions 
under leak upset and hence result in payoff under such conditions. Of course, these 
effector rules also must be discovered by the genetic algorithm (see below). 





RULE DISCOVERY 


Generating plausible replacements for rules assigned low strength under the credit 
assignment algorithm is an even more daunting task than credit assignment itself. 
In a rule-based system, the whole process of induction succeeds or fails in proportion 
to its efficacy in generating plausible new rules. However, plausible is not an easy 
concept to pin down. It implies that experience biases the generation of new rules, 
but how? 

We propose that plausibility is closely linked to the “building block” approach 
set forth in the discussion of parallelism. Applied to an individual rule, the building 
block approach requires that the rule be viewed, not as something monolithic, 
but as an entity constructed from well-chosen parts. We have traded plausible for 
well-chosen, with the direct gain that “well-chosen-ness” is something that can be
estimated on the basis of experience. If a part has been used in several rules, then 
those rules constitute samples drawn from the set of all rules that can use that part. 
The strengths of the rules in the sample allow us, for example, to make an estimate 
of the average strength of rules using that part. Though estimates are subject to 
error, they do provide an experience-dependent guideline. Both the possibility of 
error and role of experience are consonant with the term “plausibility.” 

Because it is the value of a part in a context that is usually of interest, estimates
accumulate at different rates. Just as a general condition for a rule will be satisfied 
more frequently than a specific condition, so, in a sample of possible rules, a part in 
a simple context will appear more frequently than a part in a complicated context. 
In consequence, as the system tries sets of rules against the environment, it accu- 
mulates information more rapidly about parts in simple contexts. It is not difficult 
to show that the rate of accumulation falls off exponentially with the complexity 
of the context.
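
The exponential fall-off can be checked with a short simulation. The sketch below is illustrative only: it assumes that rules are uniformly random bit strings and that a "part in a context" is a set of j fixed bit positions. The names `matches` and `sampling_rate` are hypothetical and do not come from the text.

```python
import random

def matches(rule, context):
    """True if the rule has the required bit at every position in the context."""
    return all(rule[pos] == bit for pos, bit in context.items())

def sampling_rate(context, n_rules=20000, length=16, seed=0):
    """Fraction of uniformly random rules that contain the given context."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rules):
        rule = [rng.randint(0, 1) for _ in range(length)]
        hits += matches(rule, context)
    return hits / n_rules

# Contexts fixing 1, 2, 4, and 6 positions: each extra position roughly halves the rate.
for j in (1, 2, 4, 6):
    context = {pos: 1 for pos in range(j)}
    print(j, sampling_rate(context), 2 ** -j)
```

Each printed line compares the observed rate with 2^-j, the value expected under the uniform-sampling assumption.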

This automatic differential in sampling rates has a strong influence on what 
parts are well chosen at any point in time. Early on, the system has reliable infor- 
mation only about parts in very simple contexts. It can exploit this information, 
but more complex contexts will provide frequent surprises, departures, and excep- 
tions. As the system gains experience, it gains information about more complicated 
contexts, and it can bias its choices accordingly. Consequently, the system is prone
to build hierarchies that grow from early “defaults,” based on simple contexts, to
layers of exceptions based on more detailed contextual information.

For convenience, let us call a part embedded in an appropriate context a “build- 
ing block.” Roughly, then, we can say that a building block is well chosen at a 
given time if the system can make a well-founded estimate that its use yields above- 
average rules. Even a single rule constitutes a sample point for a great many possible 
building blocks, because the rule has many parts and there are many possible con- 
texts within the rule for each of those parts. Consider, then, a sample consisting of 
a few thousand rules. In that sample, great numbers of building blocks will appear 
in at least one hundred rules. Accordingly, each such building block has at least a
hundred sample points upon which to base an estimate of its average contribution. 
It follows that the system quickly accumulates the information necessary to make 
good estimates of the relative rank of multitudes of building blocks. 

If the system can exploit this information, it can bias the rule generation process 
so that building blocks that have tested above average are favored in the construc- 
tion of new rules. However, there is a difficulty. The large numbers of building 
blocks involved make an explicit calculation of the relevant averages an infeasible 
task. Fortunately there is a way of achieving the effect of these calculations without 
carrying them out explicitly. 

We can best explain the process by resorting to a metaphor from genetics.
The system selects high strength rules as “parents,” producing “offspring” rules 
by exchanging parts between the parents. The resulting offspring then replace low- 
strength rules (not the parents) in the population of rules. Though it is not obvious,
it can be proved that this process biases the generation of rules so that building 
blocks associated with above-average rules are more frequently used in the construc-
tion of similar new rules. Moreover, the bias is proportional to the difference between
the average associated with the building block and the system-wide average.
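
The parent/offspring step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the experiments: rules are fixed-length bit strings, parents are drawn with probability proportional to strength, recombination is one-point crossover, and the function names `crossover` and `next_generation` are hypothetical.

```python
import random

def crossover(parent_a, parent_b, rng):
    """One-point crossover: swap the tails of two equal-length strings."""
    point = rng.randrange(1, len(parent_a))  # cut strictly inside the string
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

def next_generation(rules, strengths, n_pairs, rng):
    """Return a new rule list in which the weakest rules are replaced by offspring."""
    offspring = []
    for _ in range(n_pairs):
        # Parents are drawn with probability proportional to strength.
        first, second = rng.choices(rules, weights=strengths, k=2)
        offspring.extend(crossover(first, second, rng))
    # Indices of the weakest rules, which the offspring overwrite.
    weakest = sorted(range(len(rules)), key=lambda i: strengths[i])[:len(offspring)]
    new_rules = list(rules)
    for idx, child in zip(weakest, offspring):
        new_rules[idx] = child
    return new_rules

rng = random.Random(1)
rules = ["11110000", "11001100", "00001111", "10101010"]
strengths = [8.0, 6.0, 1.0, 0.5]
print(next_generation(rules, strengths, n_pairs=1, rng=rng))
```

The sketch leaves open how the offspring receive their initial strengths; that choice is outside what the text specifies.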


GENETIC ALGORITHMS 
[See Figures 7, 8, and 9]

The generation of plausible rules is predicated upon appropriate credit as- 
signment (the direct influence of experience) and biased recombination of well- 
chosen building blocks to form new rules. The new rules act as new hypotheses 
to be tested in the context of the more-established rules that the system already 
FIGURE 7 Genetic Algorithm I.


Let M(s,t) = number of instances of schema s at time t.
Then M(s,t+1) ~ u(s,t)M(s,t).


IMPLICIT PARALLELISM: The genetic algorithm biases
future trials of each "building block" s according to its
estimated fitness u(s,t) (without explicitly calculating
that average).


FIGURE 8 Genetic Algorithm II.


possesses. There are ways to further bias the rule generation process toward forming 
causal links, incorporating information about unusual events, associating salient 
events, etc., but these detailed considerations are left to the literature (see Holland
et al.).

It is important to note that recombination affects tags just as it affects other 
parts of a rule. Thus it is easy for the induction process to invent new tags
incorporating parts of established tags. This applies to both the condition part of the rule
and the action (message generating) part of the rule. As a result established tags 
spawn related tags, providing new associations, higher level categories, and new 
interactions between established categories. When these new tags implement useful
couplings—as measured by the strengths of the rules involved—they in turn spawn 
new variations for test. In effect the system is building an experience-based system 
of symbols for interior use. In doing so, it extends the default hierarchy skeleton of 
its models and then fleshes that skeleton with appropriate associations.

At first sight it may seem that recombining parts of rules has no counterpart
in the processes we typically associate with the central nervous system. However,
more thought reveals recombination as a recurrent theme in physiologically based
theories of learning and induction. We can take Hebb's treatise as a well-known,
highly influential example. In Hebb's theory cell assemblies play a role somewhere
between a rule and a cluster of rules, acting in parallel, competing, and broadcasting
their messages widely via the large number of synapses involved. Cell assemblies are
continually being restructured, under environmental influence, through processes
of fractionation (production of offspring) and recruitment (addition of parts from
other cell assemblies). Moreover, similar processes take place when cell assemblies
are integrated into larger structures such as phase sequences. It is not difficult upon
rereading Hebb to see counterparts of all the processes discussed.

FIGURE 9 Genetic Algorithm III.


THE GAS-PIPELINE EXAMPLE (V) 


The genetic algorithm was applied every 200 time-steps in order to give the bucket 
brigade time to summarize experience between changes in the rules. Thus a typical 
run involved approximately 120 generations under the genetic algorithm. 

As the system gains experience, increasing specificity and correctness of the 
“leak indicator” rule 


IF [input pressure low, output pressure low, rate of change of pressure very 
negative], 

THEN [send "leak" message],
and of the corresponding effector rules, develop hand in hand. That is, parts of
overly general, but better than random, rules are interchanged (crossed) to pro-
vide ever more specific, plausible refinements. Refinements that do indeed improve
performance then serve under the genetic algorithm as the source of parts for still
further refinements. The emergent default hierarchy, with its associated diachronic
rules, is a consequence of this continual refinement of rules that improve perfor-
mance.


USING A GENETIC ALGORITHM TO DISCOVER STRATEGIES 
FOR THE PRISONER'S DILEMMA 
[See Figure 10] 

The story: Two prisoners are being interrogated in separate rooms concerning 
a crime they have allegedly committed. Each has a choice between informing on 
the other prisoner (“defection”) or else remaining silent ("cooperation"). If both 
remain silent, then both go free; if one informs and the other remains silent, then 
the informer receives an informer's fee and goes free, while the other prisoner is
Iterated Prisoner's Dilemma

[Payoff matrix: Player A versus Player B]

Strategies (based on last three moves): 4 possible outcomes for each move
=> 4 x 4 x 4 = 64 histories.
A strategy specifies a choice for each history.

Minimax: Both players defect.

Tit-for-tat: Player copies move made by other player on previous move.

Axelrod computer tournament (strategies submitted by economists,
computer scientists, political scientists, etc. from all over
the world): Tit-for-tat prevails, inducing cooperation
(it is an evolutionary stable strategy).

Genetic algorithm applied to strategies: discovered a previously
unknown strategy that does better than tit-for-tat (in the mix of
strategies used in the tournament).

Axelrod, R. The evolution of strategies in the iterated prisoner's
dilemma. In Genetic Algorithms and Simulated Annealing.
Davis, L. D. (ed.). Kaufmann: Los Altos, 1987.


FIGURE 10 The Prisoner's Dilemma.


imprisoned and fined (to cover the informer's fee); if both inform, then both are 
imprisoned (but there are no fines). If each prisoner tries to minimize the maximum 
damage that can be imposed by actions of the other prisoner—the formal minimax 
solution of the game—then defection is the indicated action. Accordingly the for- 
mal solution is that both prisoners defect. However, the observed (experimental) 
outcome, when the game is played repeatedly by the same two players (the iterated 
prisoner's dilemma), is that players typically learn to cooperate after a time. 

The problem: The best outcome a prisoner can attain occurs if that prisoner 
defects while the other prisoner does not. This, together with the fact that defection 
minimizes damage if the other prisoner does defect, makes it very tempting to 
defect. How is it, then, that the prisoners learn to cooperate? 
One individual strategy that induces cooperation, a strategy called tit-for-tat,
simply copies on the next move whatever the other prisoner did on the current
move. Under this strategy, if the other player begins to cooperate over some ex- 
tended period, then the tit-for-tat player will also cooperate, while a defection by 
the other prisoner is punished by the tit-for-tat player with a defection on the next 
move. If both players adopt a tit-for-tat strategy then, after any instance of co- 
operation, cooperation will continue as long as they both hold to this strategy. It 
has been shown by Axelrod in computer tournaments that tit-for-tat prevails over 
all strategies submitted. Axelrod also demonstrated that a genetic algorithm, when 
used as a learning technique, can discover strategies that are better than tit-for-tat. 
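
The behavior of tit-for-tat is easy to reproduce in a few lines. The sketch below is only an illustration: the numerical payoffs are the conventional tournament values (3 for mutual cooperation, 5 and 0 for a lone defector and its victim, 1 for mutual defection), which we assume here since the story above gives no numbers, and we also assume that tit-for-tat cooperates on the first move. The function names are hypothetical.

```python
# Payoffs (my score, other's score) indexed by (my move, other's move).
# "C" = cooperate (stay silent), "D" = defect (inform). Values are assumed.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(own_history, other_history):
    """Cooperate first, then copy the other player's previous move."""
    return other_history[-1] if other_history else "C"

def always_defect(own_history, other_history):
    """The minimax choice in the one-shot game."""
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    """Play the iterated game and return the two total scores."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        hist_a.append(move_a)
        hist_b.append(move_b)
        score_a += pay_a
        score_b += pay_b
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # (30, 30): cooperation throughout
print(play(tit_for_tat, always_defect))  # (9, 14): exploited once, then mutual defection
```

Two tit-for-tat players earn more in total than any pairing that involves sustained defection, which is the sense in which the strategy induces cooperation.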


SOME THEOREMS ABOUT CLASSIFIER SYSTEMS 


The fundamental theorem for genetic algorithms (see Holland) can be rewritten
as a theorem about progressively biasing a probability distribution over the space
$\{1,0\}^k$:


THEOREM.
$$p(s,t+1) \ge [1 - \epsilon(s,t)]\,[1 - P_{mut}]^{d(s)}\,[u(s,t)/u(t)]\,p(s,t),$$


where p(s,t + 1) is the expected fraction of the population that will be occupied 
by the instances of s at time t + 1 under the genetic algorithm, given that p(s,t) is 
the fraction occupied by s at t. 

The factors on the right hand side are: (1) [u(s,t)/u(t)], the ratio of the ob-
served average value of the schema s to the overall population average. This term
determines the rate of change of p(s,t), subject to the "error" terms

$$[1 - \epsilon(s,t)]\,[1 - P_{mut}]^{d(s)}.$$

If u(s,t) is above average, the proportion of schema s increases (if the error terms
are small), and vice versa. (2) $\epsilon(s,t)$ and $P_{mut}$, the “error” terms resulting from the
breakup of instances of s because of crossover and mutation, respectively. Specifi-
cally, $\epsilon(s,t)\,p(s,t) = P_{cross}[l(s)/(k-1)]\,p(s,t)$ is an upper bound on crossover loss,
the loss of instances of s resulting from crosses that fall within the interval of length
l(s) determined by the outermost defining loci of the schema. $[1 - P_{mut}]^{d(s)}$ gives
the proportion of instances of s that escape a mutation at one of the d(s) defining
loci of s. (The underlying algorithm is stochastic so this equation only provides a
bound on expectations at each time-step. Using the terminology of mathematical
genetics, this equation supplies a deterministic model of the algorithm under the
assumption that the expectations are actually achieved on each time-step.)
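
To see how the factors interact, the bound can be evaluated directly. The sketch below assumes the theorem has the form p(s,t+1) >= [1 - eps][1 - P_mut]^d(s) [u(s,t)/u(t)] p(s,t) with eps = P_cross l(s)/(k - 1); the function name `schema_bound` and all numerical values are hypothetical, chosen only for illustration.

```python
def schema_bound(p, u_s, u_avg, defining_length, d, k, p_cross, p_mut):
    """Lower bound on the expected fraction p(s, t+1) of the population.

    p: current fraction p(s, t); u_s, u_avg: schema and population averages;
    defining_length: l(s); d: number of defining loci d(s); k: string length.
    """
    epsilon = p_cross * defining_length / (k - 1)  # crossover-loss bound
    survive_mutation = (1 - p_mut) ** d            # no mutation at any defining locus
    return (1 - epsilon) * survive_mutation * (u_s / u_avg) * p

# A short schema that is 20% above the population average.
bound = schema_bound(p=0.10, u_s=1.2, u_avg=1.0, defining_length=3, d=3,
                     k=31, p_cross=0.6, p_mut=0.001)
print(bound)
```

With these values the error terms are small, so the bound exceeds the current fraction 0.10 and the schema is expected to gain instances.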
Communication network optimization (commercial
application) [Davis, L. D. Genetic algorithms and
communication link speed design. In Genetic Algorithms
and Their Applications: Second International Conference
(MIT 1987). Erlbaum.]

Medicine

Image registration (previously unsolved problem of great
importance in medical diagnosis) [Fitzpatrick, J. M. et al.
Image registration by genetic search. Proceedings of IEEE
Southeastcon 84.]

Poker (comparative study with Waterman's learning poker
player) [Smith, S. F. Flexible learning of problem solving
heuristics via adaptive search. In Proceedings of 8th IJCAI.]

Discovery of multiplexer boolean function (comparative
study with connectionist approach showing GA-based system
substantially faster) [Wilson, S. Classifier systems and the
animat problem. Machine Learning 2.]

VLSI Compaction (difficult geometric problem) [Fourman,
M. P. Compaction of symbolic layout using genetic
algorithms. In Genetic Algorithms and Their Applications:
First International Conference (CMU 1985). Erlbaum.]


FIGURE 11 Other applications.


PROOF outline (see Holland for details):


1. Consider a schema s with M(s,t) instances in the population B(t). The av-
erage value of these instances is given by $u(s,t) = \sum_{x \in s} u(x)/M(s,t)$. If
each of these instances is copied with probability u(x)/u(t), there will be
$\sum_{x \in s} u(x)/u(t) = u(s,t)M(s,t)/u(t)$ instances of s expected in B' after
the copying. (The actual number of course will be subject to sampling error.)

2. When the point of crossover falls within the outer limits of the defining po-
sitions for a schema, the defining bits of the schema will be separated in the
offspring (otherwise they are passed on intact). Under such circumstances,
it is possible (but not necessary) that neither offspring is an instance of the
schema, so that there is a “loss” of one instance in the process. Because the
point of crossover is chosen at random in each case, the probability that the
cross falls within the outer defining positions is given by [l(s)/(k - 1)], where
l(s) is the number of crossover points between the outer defining positions of
s. Thus $P_{cross}[l(s)/(k-1)]$ gives an upper bound on the probability that a
given instance of s will be lost because of crossover during the formation of
B''. A similar calculation provides the loss rate because of mutation.

3. It follows that the number M(s,t + 1) of instances of s to be expected after
copying and crossover is bounded below by

$$[1 - \epsilon(s,t)]\,[1 - P_{mut}]^{d(s)}\,[u(s,t)/u(t)]\,M(s,t).$$


4. But p(s,t), the fraction of instances of s in the population B(t), is by defini-
tion M(s,t)/M, so that

$$p(s,t+1) \ge [1 - \epsilon(s,t)]\,[1 - P_{mut}]^{d(s)}\,[u(s,t)/u(t)]\,p(s,t).$$
QED


In any population that is not too small, distinct schemas will almost always 
have distinct subsets of instances if the number of instances is relatively small. For 
example, in a randomly generated population of size 2500, any schema defined on 
8 loci can be expected to have about 10 instances. There are

$$\binom{2500}{10} \approx 2.6 \times 10^{27}$$

ways of choosing this subset, so that it is extremely unlikely that the subsets of
instances for two such schemas will be identical. (Looked at another way, the chance
that two schemas have even one instance in common is less than $10 \times 2^{-8} \approx 1/25$
if they are defined on disjoint subsets of loci). Because the sets of instances are 
overwhelmingly likely to be distinct, the observed averages u(s, t), will be determined 
mostly by independent samples. As a consequence, the rate of increase (or decrease) 
of a schema s under a genetic algorithm is largely uncontaminated by the rates 
associated with other such schemas. Loosely, the rate is uninfluenced by “crosstalk” 
from the other schemas. 
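
The figures in this example are quick to verify. The following lines are a plain arithmetic check using only the numbers quoted above (population 2500, 8 defining loci, subsets of 10 instances); the variable names are ours.

```python
import math

population = 2500
loci = 8

expected_instances = population / 2 ** loci  # a schema on 8 loci matches 1 string in 256
subsets = math.comb(population, 10)          # ways to choose which 10 strings are instances
overlap_bound = 10 * 2 ** -loci              # bound on two schemas sharing an instance

print(round(expected_instances, 2), f"{subsets:.2e}", round(overlap_bound, 4))
```

The expected number of instances is just under 10, the number of possible subsets is on the order of 10^27, and the overlap bound is slightly below 1/25.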

From the point of view of sampling theory (applied to populations large enough 
that sampling without replacement is insignificantly different from sampling with 
replacement), 20 or 30 instances of a schema s constitutes a sample large enough to 
give some confidence to the corresponding estimate of u(s). Thus, for such schemas, 
the biases p(s,t) produced by a genetic algorithm over a succession of generations 
are neither much distorted by sampling error nor smothered by "crosstalk." 

To gain some idea of how many schemas are so processed consider the following: 
THEOREM. Select some bound $\epsilon$ on the crossover error and pick k' such that
$k'/k < \epsilon/2$. (The theorem is only of interest when $\epsilon k/2 > 1$.) Consider a
population of size $M = c_1 2^{k'}$, where $c_1$ is a small integer (say, $c_1 < k'$). If M is
obtained as a uniform random sample from $\{1,0\}^k$, the number of schemas
propagated with an error less than $\epsilon$ greatly exceeds $M^3$.


PROOF outline: 


1. Consider a "window" of 2k' contiguous loci in a string of length k such that
$2k'/k < \epsilon$. Clearly any schema having all its defining loci within this window
will be subject to a crossover error less than $\epsilon$.


2. There are

$$\binom{2k'}{k'} \approx 2^{2k'}[\pi k']^{-1/2}$$

ways of selecting k' defining positions in the window, and there are $2^{k'}$ different
schemas that can be defined using any given set of k' defining loci. Therefore,
there are approximately $2^{3k'}[\pi k']^{-1/2}$ distinct schemas with k' defining posi-
tions that can be defined in the window.


3. A population of size $M = c_1 2^{k'}$ obtained by a uniform random sampling of
$\{1,0\}^k$ can be expected to have $c_1$ instances of every schema defined on k'
defining positions. Therefore, for the given window, there will be approximately
$M^3 (c_1)^{-3} [\pi k']^{-1/2}$ schemas having instances in the population and defined on
some set of k' loci in the window.


4. The same argument can be given for schemas of length k' - 1, k' - 2, ..., and
for k' + 1, k' + 2, ..., with values of

$$\binom{2k'}{k' \pm j}$$

decreasing in accord with the binomial distribution. There are also k - 2k' + 1
distinct positionings of the window on strings of length k. It follows that many
more than $M^3$ schemas, with instances in the population of size M, increase or
decrease at a rate given by their observed marginal averages with a crossover
error less than $\epsilon$.


QED 


A genetic algorithm's ability to meaningfully bias the sampling rate of a large
number of schemas while processing a relatively small set of instances is called
implicit parallelism (né intrinsic parallelism, Holland).
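
The counting in step 2 of the proof outline can be checked numerically. The sketch below assumes the count is the central binomial coefficient times 2^k', compared against the approximation 2^(3k') / sqrt(pi k'); the values k' = 8 and c_1 = 4 and the function name `schemas_in_window` are hypothetical.

```python
import math

def schemas_in_window(k_prime):
    """Schemas with exactly k' defining loci inside one window of 2k' loci."""
    positions = math.comb(2 * k_prime, k_prime)  # choices of defining positions
    values = 2 ** k_prime                        # bit assignments at those positions
    return positions * values

k_prime, c1 = 8, 4
M = c1 * 2 ** k_prime  # population size used in the theorem
exact = schemas_in_window(k_prime)
approx = 2 ** (3 * k_prime) / math.sqrt(math.pi * k_prime)
print(M, exact, round(approx))
```

This covers a single window and a single schema size only; the theorem's total comes from adding the neighboring sizes and the other positions of the window.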
THEOREM. [Bucket brigade local fixed-point.] If, under the bucket brigade algo-
rithm, $I_C$ is the long-term average income (after taxes) of a classifier C, and $r_C$ is
its bid-ratio, then its strength $S_C$ will approach $I_C/r_C$.
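
A simple iteration makes the fixed point visible. The sketch below is a deliberately reduced model, not the full bucket brigade: it assumes the classifier wins the bidding on every time-step, pays a bid equal to its bid-ratio times its current strength, and receives a constant income. The function name `simulate_strength` and the numbers are hypothetical.

```python
def simulate_strength(income, bid_ratio, initial_strength=0.0, steps=200):
    """Iterate S <- S - bid_ratio * S + income and return the final strength."""
    strength = initial_strength
    for _ in range(steps):
        strength += income - bid_ratio * strength  # pay the bid, receive the income
    return strength

print(simulate_strength(income=5.0, bid_ratio=0.1))  # approaches 5.0 / 0.1 = 50
```

At the fixed point the bid paid out equals the income received, which gives a strength of income divided by bid-ratio regardless of the starting strength.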


THEOREM. [Q-morphism parsimony; for definitions, see Holland et al.] A q-mor-
phism of n levels, in which each successive level uses k or fewer additional variables
to define exceptions to the previous level, and in which the rules at each level are
correct over at least a proportion p of the instances satisfying them, requires no
more than $\sum_{j=1}^{n} 2^{jk}(1-p)^{j-1}$ rules.

(A homomorphism defined on nk variables requires $2^{nk}$ rules.) For n = 10,
k = 2, p = 0.5, the q-morphism requires fewer than $2^{12}$ rules, while a corresponding
homomorphism would require $2^{20}$ rules; that is, the homomorphism would require
at least 256 times as many rules as the q-morphism.
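
The numerical comparison can be reproduced directly. The sketch below assumes the bound is the sum over levels j = 1, ..., n of 2^(jk) (1 - p)^(j-1), a form that matches the figures quoted for n = 10, k = 2, p = 0.5; the function names are hypothetical.

```python
def q_morphism_bound(n, k, p):
    """Upper bound on rules: sum over levels j of 2^(j*k) * (1 - p)^(j - 1)."""
    return sum(2 ** (j * k) * (1 - p) ** (j - 1) for j in range(1, n + 1))

def homomorphism_rules(n, k):
    """A homomorphism on n*k binary variables needs one rule per combination."""
    return 2 ** (n * k)

q_rules = q_morphism_bound(10, 2, 0.5)
h_rules = homomorphism_rules(10, 2)
print(q_rules, h_rules, h_rules / q_rules)  # 4092.0 1048576 256.25
```

The q-morphism bound of 4092 rules is just under 2^12 = 4096, and the homomorphism needs slightly more than 256 times as many.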





RECAPITULATION 


The basic elements of our framework are a performance system and a set of inductive 
mechanisms that continually modify the performance system on the basis of expe- 
rience. A performance system designed to operate in a realistic environment—an 
environment that is both rich and continually varying—must meet some stringent 
requirements. Moreover, the performance system must be protean, able to change
to whatever organization is suggested by experience under the influence of the sys-
tem's credit assignment and rule generation procedures. This tour has concentrated
on six mechanisms designed to meet both the performance requirements and the
requirement of inductive adequacy:


1. Parallelism, acting in concert with the "building block" approach, provides 
flexibility and transfer of experience. This enables the system to distinguish 
useful, repeatable events in a torrent of irrelevant and misleading sensory data. 


2. Competition allows the system to marshall its rules as the situation demands, 
and it allows the system to gracefully insert new rules without disturbing estab-
lished capabilities. Most importantly, competition allows all rules to be treated
as hypotheses, more or less confirmed, thereby stepping around difficult global 
consistency requirements. 


3. Specificity of rule conditions, by influencing rule competition, allows the system 
to invoke the most relevant rules in each situation. 


4. Default hierarchies, and the attendant internal models, enable the system to
make predictions. Subsequent confirmation or falsification of the predictions
enables the inductive mechanisms to make relevant revisions of structure, in
the absence of reinforcement. Default hierarchies emerge naturally under the
combined action of specificity-biased competition and strength-biased rule gen-
eration.


5. Coupling of rules provides for association, sequential action, and plans. It also 
provides the “bridges” for the bucket-brigade credit assignment algorithm. Cou- 
plings are easily generated and modified by the recombinations provided by the 
rule generation procedure. 


6. Support integrates the partial and circumstantial evidence provided by rules 
that respond to particular aspects of the overall situation.


The bucket brigade algorithm for credit assignment is designed to work with 
local information in a highly parallel system in a way that gets credit to stage- 
setting rules. It treats the overall system as a complex economy in which each rule 
is a “middleman” depending for survival on its interactions with the rules to which
it is coupled (its "suppliers" and its "consumers"). 

The rule generation procedure treats strong rules as “parents” producing off- 
spring by recombining parts from the parents. The offspring then replace weak 
rules in the system, acting as new hypotheses to be tested in situations where the 
system does not have well-established rules. It can be shown that this procedure
biases rule generation toward the use of components appearing in successful rules. 
This experience-biased choice of building blocks provides new rules that are at least 
plausible on the basis of that experience. 

Each of the mechanisms used by the performance system has been designed 
to enable the system to continue to adapt to its environment while using the ca- 
pabilities it already has to respond, instant by instant, to that environment. In so 
doing, the system is constantly trying to balance exploration (acquisition of new 
information and capabilities) with exploitation (the efficient use of information and 
capabilities already available). The cognitive system that results is well founded in 
computational terms, and it does indeed get better at attaining goals in a perpet-
ually novel environment.


THE PERFORMANCE OF ANNs IN ARBITRARY NONLINEAR
ENVIRONMENTS


[See Figures 12 through 16] 
1. In general, an ANN's component structures (rules, strategies, chromosomes,
policies, or the like) can be represented as a collection of k-bit strings (say, a set
of classifiers recoded as binary numbers). In the discussion that follows we will
assume that each of these strings can be assigned some value, as a measure of its
contribution to overall performance, as when one assigns fitness to individuals
in an evolving biological population, or worth to individual corporations in an
economy, and so on. That is, we will assume that credit has been assigned by
an appropriate algorithm (such as the bucket brigade algorithm).


On the basis of this assumption we can model the ANN’s search as a sampling
of the space of strings $\{1,0\}^k$ using a probability distribution p(t) that changes
progressively as time t increases. Each $x \in \{1,0\}^k$ represents a structure to
be tried, and the function $u: \{1,0\}^k \to$ Reals determines the value u(x) returned
when x is tried. For an ANN, u(x) will be a complicated nonlinear function. The
evaluation of a single x will be a time-consuming task as, for example, when x
is a strategy for playing a game. Here we are only concerned with conditions
under which the information returned, the value u(x) of the structure x, will
be helpful in biasing the distribution p(t) that directs the search of $\{1,0\}^k$.


Complex adaptive systems typically exist far from any
attractor because fitness is not additive over genes:

For additive fitness,

$$f(x) = \sum_{i=1}^{k} f_i(x_i),$$

if there are 2 alleles (variants) per locus, 2k trials
establish the global optimum.

For epistatic fitness (f(x) nonlinear with no useful
l.m.s. estimator), $O(2^k)$ trials are required.

k > 50 => improvements will continue to be found over
all physically feasible time-spans.

Typical genotypes involve $k > 10^4$ genes.


FIGURE 12 Effects of nonlinearity.
SOME SCHEMAS [HYPERPLANES ON $\{1,0\}^k$]

[Diagram: example schemas; partitions defined on loci 1 and 2; schemas as subintervals of the real line interval [0,1].]


FIGURE 13 Some schemas (hyperplanes on $\{1,0\}^k$).


2. The information accumulated by sampling u's argument space $\{1,0\}^k$ can be
more transparently related to possibilities for further biasing p(t) if u is re-
represented using a hyperplane transform. The hyperplane transform uses the
fact that, under the distribution p(t), the function u is a random variable and
subsets of the argument space $\{1,0\}^k$ are events having well-defined expec-
tations. The hyperplane transform uses the expectations of selected sets of
hyperplanes in $\{1,0\}^k$ to re-represent u. It can be shown that this transform
provides a unique, invertible representation for any finite, nonlinear function u.


3. Biasing a search toward (or away from) some subset of $\{1,0\}^k$, if it is to
be information based, requires an estimate that elements of that subset are,
on average, better (worse) than elements elsewhere. Concentrating on hyper-
planes, we note that hyperplanes of higher dimension, being larger subsets of
$\{1,0\}^k$, typically receive a larger fraction of any set of samples drawn from
$\{1,0\}^k$. As a consequence, estimates of the expectation u(s) associated with
a higher-dimensional hyperplane s will be confirmed faster than similar esti-
mates for lower-dimensional refinements of s. Accordingly, as the number of
samples increases, biases should proceed from biases based on estimates for
high-dimensional hyperplanes to biases involving lower-dimensional refinements
of those hyperplanes.


Schemas as Sample Spaces.

Given: A probability distribution p defined over $\{1,0\}^k$, such as that
induced by a genetic algorithm.

The fitness $u: \{1,0\}^k \to R^+$ becomes a random variable, and schemas
become events in the sample space.

Under p, the function u has a well-defined average value u(s) on each
schema s.

[Diagram: a string of k loci with a marked set of defining loci.]

Each selection of a set of loci in the string defines a unique partition of
$\{1,0\}^k$ consisting of the set of hyperplanes that can be defined on that set
of loci.

Each such partition can be assigned a unique index by the device
*...*dd*d* -> 0...0011010 = $(26)_{10}$


FIGURE 14 Schemas as sample spaces.
4. The problem, then, is to design a feasible algorithm that, as information accu-
mulates, provides the biases suggested by the hyperplane transform. It is easy
to see that, for large k, it is not feasible to carry out an explicit calculation
of the hyperplane transform each time p(t) is changed. However, we have seen
that genetic algorithms rapidly provide the biasing implied by the hyperplane
transform without explicitly carrying out the calculations involved.


It is worth contrasting the effects of mutation with the effects of crossover in 
propagating the ANN's search: 

Mutations of loci contiguous to the defining loci of an above-average schema
s can provide instances of schemas not previously present in the population. Each
such schema, s', is a refinement of s (i.e., s contains s') and hence is an element of a
deeper level partition. Consider, then, a partition of s comprised of $2^h$ hyperplanes
obtained by specifying the values for some set of h loci contiguous to s. Because
the mutations are allocated randomly, the number of samples n(s') of s' will be
approximately $2^{-h}n(s)$, where n(s) is the number of samples allocated to s. The
central limit theorem assures that the averages of different sets of samples of any s'
will be distributed approximately as a Gaussian distribution, whatever the proba-
bilities assigned to the elements x in s'. The sampling process (mutation operator)
thus produces an estimate of u(s') with a variance that decreases as $\sqrt{2^{-h}n(s)}$.
Very roughly, then, one would expect to reliably discover s' for which u(s') > u(s)
at a rate on the order of $\sqrt{2^{-h}n(s)}$. In other words, mutation will discover improve-
ments in the vicinity of s at a rate that falls off as the square root of the number
of samples allocated to s. In biological terms, this would correspond to an adaptive
radiation wherein variants of the prototype s provide incremental improvements.

This process of “exploring the neighborhood” via mutations contrasts sharply
with the jumps produced by crossover. To develop the contrast, consider a ran-
domly generated population with instances of two above-average schemas $s_1$ and
$s_2$, where $d(s_1) > d(s_2)$. Let the defining bits of $s_1$ and $s_2$ be such that there
are no instances of the schema s designating the intersection of $s_1$ and $s_2$, and let
$u(s) > \max\{u(s_1), u(s_2)\}$.

Under these conditions, on the order of $d(s_2)/2$ mutations must accumulate in
some instance of $s_1$ before an instance of s appears in the population. The mutation
rate can of course be increased to make the accumulation more rapid, but only at
the cost of making it increasingly unlikely that s will be “copied” into successive
generations (see Figures 17 and 18). Mutations tend to explore in a linear way: the
depth of the exploration is a linear function of the number of generations elapsed.

The Hyperplane Transform.

Each partition j defined by a set of defining loci can be assigned a
unique “discrepancy” δ_j
(where δ_j is roughly a measure of the departure from the overall
mean of the averages associated with the partition's elements).

u(x)  <-- TRANSFORM -->  δ_j

The collection of δ_j uniquely defines the function u and its average
value u(s) on any schema s.
(The summation giving u(s) in terms of the δ_j, with coefficients built
from G(s') and d(s') as defined below, is illegible in the source.)

where
δ_0 = u(**...*), the expectation of u under p
d(s') = (no. of defining loci for s) - (no. of defining loci for s')
G(s') = +1 if s' has an even no. of defining 0's
        -1 otherwise

FIGURE 15 The hyperplane transform.


On the other hand, a single crossover between parents that are instances of s_1
and s_2, respectively, can yield an instance of s. That is, an above-average schema s at
depth 2b that falls in the intersection of established schemas s_1 and s_2 at depth b can
be discovered in a single generation. This doubling of depth in successive generations
comes about whenever established schemas can be combined as "building blocks"
to yield improved schemas. Under these conditions, crossover explores with directed
exponential increases in depth.

Biased Sampling.

A Genetic Algorithm generates a biased sample of the schemas
(hyperplanes) s according to the relation

p(s,t+1) = [u(s,t)/u(t)] p(s,t),

subject to “errors” introduced by crossover, mutation, etc.

A hyperplane at level j (j defining loci) is sampled once in every 2^j
trials when the sample points are uniformly distributed.

At first, a population under a genetic algorithm stores information
about u(s) for s at LEVEL(s) ≤ log2 N,
where N = (no. generations) x (generation size).

As biasing progresses, recombination of schemas under crossover
allows the population to accumulate information about schemas for
LEVEL(s) >> log2 N.

“Radiation” = generation of new schemas by mutations of schemas with
large numbers of instances in the population (increases the level
by a small integer).

“Saltation” = generation of new schemas by recombination of schemas
with large numbers of instances in the population (roughly, doubles
the level).

If the non-zero δ's under the Hyperplane Transform are sparse, then
the Genetic Algorithm will exhibit substantial periods of "radiation"
punctuated by occasional “saltations” when a deeper non-zero δ is
discovered by recombination.

FIGURE 16 Biased sampling.
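
The sampling relation in Figure 16 can be illustrated numerically. The following is a minimal sketch, not Holland's implementation: it iterates p(s,t+1) = [u(s,t)/u(t)] p(s,t) under selection alone, ignoring the "errors" from crossover and mutation, and it assumes (purely for illustration) that instances of s have a constant average payoff `u_s` while all other strings have a constant average payoff `u_rest`. The function name and parameters are hypothetical.

```python
def schema_proportions(p0, u_s, u_rest, generations):
    """Iterate p(s, t+1) = [u(s, t) / u(t)] * p(s, t) with no operator errors.

    p0          -- initial proportion of the population that are instances of s
    u_s         -- assumed constant average payoff of instances of s
    u_rest      -- assumed constant average payoff of all other strings
    generations -- number of generations to iterate
    """
    history = [p0]
    p = p0
    for _ in range(generations):
        # Population average u(t): instances of s weighted by their proportion.
        u_bar = p * u_s + (1.0 - p) * u_rest
        p = (u_s / u_bar) * p
        history.append(p)
    return history


if __name__ == "__main__":
    # An above-average schema (u_s > u_rest) gains an increasing share.
    for t, p in enumerate(schema_proportions(0.01, 1.2, 1.0, 40)):
        if t % 10 == 0:
            print(f"generation {t:2d}: p(s) = {p:.4f}")
```

Under these assumptions an above-average schema increases its share every generation and a below-average one loses share, which is the bias the figure describes.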


During the time that a schema has a large number of instances in the popula- 
tion, the genetic algorithm acts to provide many new instances of it. The mutation 
operator treats the schema as a focal point, generating a variety of new instances 
in its “neighborhood.” Crossover, and other forms of recombination, generate new 
instances by treating the schema as a “building block” that can be used in combi- 
nation with other building blocks. 

The fate of a newly discovered instance of a deep, above-average schema s 
depends upon the manner of its discovery. Consider a schema s with a length l(s)




496 John H. Holland 


Searching for Better Schemas I.

s, a valuable schema (e.g., the current best), defined on m bits.

ss', a more valuable refinement of s, defined on j further bits, with u(ss') > u(s).

Assume all other refinements ss'' of s involving j' ≤ j bits are
considerably less valuable than s, u(ss'') << u(s).

I.e., s is surrounded by a "desert".

FIGURE 17 Searching for better schemas I.


large enough to indicate a large crossover error. Under normal circumstances, this 
crossover error would quickly destroy instances of s. However, if s has been discov- 
ered by recombination of well-established building blocks, things happen differently. 
The well-established building blocks occupy large fractions of the population, so the 
parents are likely to hold several building blocks in common. As a result, crosses that 
normally would break up instances of s now just recreate s because they exchange 
pieces of identical building blocks. Thus, s increases its representation despite the 
large crossover error. 

Generally, crossover is the operator that can be expected to yield substantial
improvements based on deeper δ's. Crossover implements the heuristic that “good”
structures are constructed of “good” building blocks (cf. Simon’s discussion of
the architecture of complexity). This amounts to a conjecture that new non-zero
δ's are associated with the intersections of hyperplanes already known to be
associated with non-zero δ's. Of course, the conjecture may prove untrue for many
intersections, but it need only be true upon occasion for improvements to be made.

It should be recalled that improvement is the object of the search; the global 
optimum may involve δ's so deep that they will never be uncovered in feasible times.
Implicit parallelism, by assuring that the genetic algorithm usefully searches large 

Searching for Better Schemas II.

Let ε be an upper bound on the probability that an instance of a schema
will be broken up by a genetic operator, i.e., ε sets a bound on the loss
of information.

SEARCH TIME T_mut UNDER MUTATION:

Under mutation the expected schema loss is ≈ (m+j) P_mut
=> P_mut ≤ ε/(m+j).

To reach ss' from s requires j/2 (expected) simultaneous mutations (any
smaller no. falls in the lethal "desert")

=> The probability of reaching ss' under mutation is
   P_mut^{j/2} ≤ (ε/(m+j))^{j/2}
=> Expected search time is
   T_mut ≈ ((m+j)/ε)^{j/2}

SEARCH TIME T_cross UNDER CROSSOVER:

Under crossover the expected schema loss is ≈ (m+j)/k
=> ε ≥ (m+j)/k, or 1/k ≤ ε/(m+j).

The suffix s' occurs with probability 2^{-j} (assuming a uniform random
distr.), and it will be attached to s under crossover only if a cross occurs
exactly at the junction of s and s'.

=> The probability of reaching ss' under crossover is
   (1/k)(2^{-j}) ≤ (ε/(m+j)) 2^{-j}
=> Expected search time is
   T_cross = ((m+j)/ε) 2^j

For ε = 0.1, m = 16, j = 4: (the numerical comparison of T_cross and T_mut is illegible in the source).

FIGURE 18 Searching for better schemas II.
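
The two search-time estimates in Figure 18 can be compared numerically. The sketch below is tentative: it assumes the figure's bounds read T_mut ≈ ((m+j)/ε)^(j/2) and T_cross = ((m+j)/ε)·2^j, which is a reading of a poorly reproduced figure rather than a confirmed transcription, and the function names are hypothetical.

```python
def search_time_mutation(eps, m, j):
    """Assumed reading of Figure 18: T_mut ~ ((m + j) / eps) ** (j / 2)."""
    return ((m + j) / eps) ** (j / 2)


def search_time_crossover(eps, m, j):
    """Assumed reading of Figure 18: T_cross = ((m + j) / eps) * 2 ** j."""
    return ((m + j) / eps) * 2 ** j


if __name__ == "__main__":
    eps, m = 0.1, 16
    for j in (2, 4, 8, 12):
        t_mut = search_time_mutation(eps, m, j)
        t_cross = search_time_crossover(eps, m, j)
        print(f"j = {j:2d}: T_mut = {t_mut:.3g}, T_cross = {t_cross:.3g}")
```

Under this reading the mutation estimate grows with j as a power of (m+j)/ε while the crossover estimate grows only as 2^j, so the gap widens quickly as the refinement s' gets longer.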


numbers of schema combinations in each successive generation, makes it likely that 
some useful intersections will be uncovered. It is possible to design a function u
that often “guides” the genetic algorithm away from good regions, but it is hard 
to design a function that keeps the algorithm away from improvements over long 
intervals. 

Overall, and in qualitative terms, the search of a complex nonlinear function
with non-zero δ's at many levels typically exhibits continual small improvements,
punctuated by saltations to schemas involving deeper δ's. This behavior is a direct
consequence of the manner in which genetic operators exploit sparse deeper δ's.
From a biological perspective, it is interesting that this succession of “punctuated
equilibria” occurs without the intervention of higher-order selection principles.

TOWARD A THEORY OF
INTERDEPENDENT DECISION 


On the strategy of pure conflict — the zero-sum games — 
game theory has yielded important insight and advice. But on 
the strategy of action where conflict is mixed with mutual de- 
pendence — the nonzero-sum games involved in wars and threats 
of war, strikes, negotiations, criminal deterrence, class war, race 
war, price war, and blackmail; maneuvering in a bureaucracy 
or in a traffic jam; and the coercion of one's own children — 
traditional game theory has not yielded comparable insight or 
advice. These are the "games" in which, though the element of
conflict provides the dramatic interest, mutual dependence is 
part of the logical structure and demands some kind of collabora- 
tion or mutual accommodation — tacit, if not explicit — even if 
only in the avoidance of mutual disaster. These are also games in 
which, though secrecy may play a strategic role, there is some 
essential need for the signaling of intentions and the meeting of 
minds. Finally, they are games in which what one player can
do to avert mutual damage affects what another player will do to 
avert it, so that it is not always an advantage to possess initia- 
tive, knowledge, or freedom of choice. 

Traditional game theory has, for the most part, applied to 
these mutual-dependence games (nonzero-sum games) the meth- 
ods and concepts that proved successful in studying the strategy 
of pure conflict. The present chapter and the one to follow 
attempt to enlarge the scope of game theory, taking the zero-sum 
game to be a limiting case rather than a point of departure. The 
proposed extension of the theory will be mainly along two lines. 
One is to identify the perceptual and suggestive element in the 

1 The Rules of the Game


1.1 Basic Definitions 


Game theory is concerned with the actions of individuals who are conscious that 
their actions affect each other. When the only two publishers in a city choose prices 
for their newspapers, aware that their sales are determined jointly, they are players 
in a game with each other. They are not in a game with the readers who buy the 
newspapers, because each reader ignores his effect on the publisher. Game theory 
is not useful when decisions are made that ignore the reactions of others or treat 
them as impersonal market forces. The best way to understand which situations 
can be modelled as games is to think about examples. Consider the following: 


(1) OPEC members choosing their annual output. 

(2) General Motors purchasing from US Steel. 

(3) Two manufacturers, one of nuts and one of bolts, deciding whether to use 
metric or American standards. 

(4) A board of directors setting up a stock option plan for the Chief Executive
Officer.

(5) United Fruit Company hiring workers in Honduras in the 1930s. 

(6) An electric company deciding whether to order a new power plant given its 
estimate of demand for electricity in ten years. 


The first four examples are games. (1) OPEC members are playing a game 
because Saudi Arabia knows that Kuwait’s oil output is based on Kuwait’s forecast 
of Saudi output, and the output from both countries matters to the world price. (2) 
A significant portion of American trade in steel is between General Motors and US 
Steel, companies which realize that the quantities traded by each of them affect the 
price. One wants the price low, the other, high. (3) The nut and bolt manufacturers 
are not in conflict, but the actions of one affect the desired actions of the other.
(4) The board of directors chooses a stock option plan anticipating the effect on
the actions of the CEO.

Game theory is inappropriate for modelling the final two examples. (5) Each 
individual worker affects United Fruit insignificantly, and each worker makes his 
employment decision without regard for the impact on United Fruit’s behavior. (6)
The electric company faces a complicated decision, but it does not face another
rational agent. Changes in the important economic variables could turn examples
(5) and (6) into games. The appropriate model changes if United Fruit faces a
plantation workers’ union or if the public utility commission pressures the utility
to change its generating capacity.

Game theory as it will be presented in this book is a modelling tool, not an 
axiomatic system. The presentation in this chapter is unconventional. Rather than 
starting with mathematical definitions, or simple little games of the kind used later 
in the chapter, we will start with a situation to be modelled, and build a game from 
it step by step. 


Describing a Game 


The essential elements of a game are players, actions, information, strategies, 
payoffs, outcomes, and equilibria. At a minimum, the game’s description must 
include the players, strategies, and payoffs, for which the actions and information 
are building blocks. The players, actions, and outcomes are collectively referred to 
as the rules of the game, and the modeller’s objective is to use the rules of the 
game to determine the equilibrium. 

We will define the terms using a game we will call OPEC Model I as an example. 


The players are the individuals who make decisions. Each player’s goal is to
maximize his utility by choice of actions.


In OPEC Model I, we specify the players to be Saudi Arabia (S) and Others 
(O) (referring to the other members of OPEC). Let us assume that each player's 
utility is the sum of his oil revenues in 1988 and 1989. Passive individuals like 
the American consumer, who react predictably to oil price changes without any 
thought of trying to change anyone's behavior, are not players, but environmental 
parameters. Sometimes it is useful to explicitly include individuals in the model 
called non-players whose actions are taken in a purely mechanical way. 


Nature is a non-player who takes random actions at specified points in the 
game with specified probabilities. 


In OPEC Model I, we will assume that the strength of world demand for oil, 
denoted D, can take one of two permanent values. At the beginning of the game, 
Nature randomly decides whether oil demand will be Weak or Strong, assigning, let 
us assume, probabilities of 70 and 30 percent. Even if the players always took the 
same actions, this random move means that the model would yield more than just 
one prediction. We say that there are different realizations of a game depending
on the results of random moves. 


An action or move by player i, denoted a_i, is a choice he can make.


Player i's action set A_i = {a_i} is the entire set of actions available to him.

An action combination is an ordered set a = {a_i}, (i = 1,...,n) of one
action for each of the n players in the game.


For our model we specify the same action sets for both Saudi Arabia and Others:
an oil output that is either High or Low in each year. We will use the notation
Q_{country,year} = level: if Saudi output in 1988 is High, we say that Q_{S,8} = H.

Besides specifying the actions available to a player, we must specify when they 
are available: the order of play. We will say that a country chooses its outputs 
afresh each year rather than choosing both years’ outputs at the start of the game. 
The order of play is therefore 


(0) Nature picks demand, D, to be Weak or Strong.
(1) Saudi Arabia and Others simultaneously choose their individual 1988 outputs
from the action sets
{Q_{S,8} = L, Q_{S,8} = H} and {Q_{O,8} = L, Q_{O,8} = H}.
(2) Saudi Arabia and Others simultaneously choose their individual 1989 outputs
from the action sets
{Q_{S,9} = L, Q_{S,9} = H} and {Q_{O,9} = L, Q_{O,9} = H}.


An alternative specification, appropriate if the technology of oil production re- 
quired advance planning, is that a country chooses its output for both years at the 
start of the game. Then the order of play would have just two elements: 


(0) Nature picks demand, D, to be Weak or Strong. 
(1) Saudi Arabia chooses its individual 1988 and 1989 outputs from the action 
set 


{(Q_{S,8} = L, Q_{S,9} = L), (Q_{S,8} = L, Q_{S,9} = H),
(Q_{S,8} = H, Q_{S,9} = L), (Q_{S,8} = H, Q_{S,9} = H)}.


Others simultaneously chooses actions from its equivalent action set. 


Information is modelled using the concept of the information set, which we 
will define precisely in Section 2.3. For now, think of a player’s information set as 
his knowledge at a particular time of the values of different variables. The elements 
of the information set are the different values that the player thinks are possible. 
If the information set has many elements, there are many values the player cannot 
rule out; if it has one element, he knows the value precisely. Let us specify that 
after Nature moves, Saudi Arabia knows whether world oil demand is Strong or 
Weak, but Others cannot rule out either possibility. The information sets are 


Others: {D = Strong, D = Weak}; 
Saudi Arabia: {D = Strong} or {D = Weak}, depending on demand. 


A player’s information set includes not only distinctions between the values of 
variables like the strength of oil demand, but also knowledge of what actions have 
previously been taken, so his information set changes over the course of the game. 
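
The information sets just specified can be written down directly. The sketch below is only an illustration: `information_set` and `knows_demand` are hypothetical helper names, not notation from the text, and the code covers only what each player can rule out about D immediately after Nature moves.

```python
DEMAND_STATES = frozenset({"Strong", "Weak"})


def information_set(player, true_demand):
    """Return the demand values the player cannot rule out after Nature moves."""
    if true_demand not in DEMAND_STATES:
        raise ValueError(f"unknown demand state: {true_demand!r}")
    if player == "Saudi Arabia":
        # Saudi Arabia observes Nature's move, so its set has one element.
        return frozenset({true_demand})
    if player == "Others":
        # Others cannot rule out either possibility.
        return DEMAND_STATES
    raise ValueError(f"unknown player: {player!r}")


def knows_demand(info_set):
    """A player knows the value precisely when the set has a single element."""
    return len(info_set) == 1


if __name__ == "__main__":
    for player in ("Saudi Arabia", "Others"):
        info = information_set(player, "Weak")
        print(player, sorted(info), "knows D:", knows_demand(info))
```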

Player i's strategy s_i is a rule that tells him which action to choose at each
instant of the game, given his information set.


Player i's strategy set or strategy space S_i = {s_i} is the set of strategies
available to him.


A strategy combination s = (s_1,...,s_n) is an ordered set consisting of one
strategy for each of the n players in the game.


Since the information set includes whatever the player knows about the previous 
actions of other players, the strategy tells him how to react to their actions. In 
the OPEC game the actions are to produce High or Low in 1988 and 1989. One 
strategy in Saudi Arabia's strategy set is 

Q_{S,8}(D) = L if D = Weak,
             H if D = Strong;

Q_{S,9}(D, Q_{S,8}, Q_{O,8}) = L if D = Weak, Q_{S,8} = L, and Q_{O,8} = L,
                               H otherwise.



This strategy gives Saudi Arabia’s 1988 action as a function of the strength of 
demand alone (since Others’s action is not yet known), and its 1989 action as a 
function of demand, its own 1988 action, and Others’s 1988 action. A strategy is 
a function only of observed history, not of current actions or of another player’s 
strategy. Saudi Arabia’s strategy cannot be specified to give its 1989 action as 
a function of Others’s 1989 action or Others’s strategy. Such misspecification is 
a common source of confusion. In the simple games of the next few sections the 
distinction between actions and strategies is not important, but in later chapters 
it will be quite helpful. The concept of the strategy is useful because the action a 
player wishes to pick depends on the past actions of Nature and the other players. 
Only rarely can we predict a player’s actions unconditionally. More often we can 
predict how he will respond to the outside world. 

Another source of confusion is that a player’s strategy is a complete set of instruc- 
tions for him, which tells him what actions to pick in every conceivable situation, 
even if he does not expect to reach that situation. Strictly speaking, even if a 
player’s strategy instructs him to commit suicide in 1989, it ought to also specify 
what actions he takes if he is still alive in 1990. Besides being necessary to fit 
the definition, this kind of carefulness will be important when we look at subgame 
perfect equilibrium in Chapter 4. The completeness of the description also means 
that strategies, unlike actions, are unobservable. A strategy is mental; an action is 
physical. 
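
The idea that a strategy is a complete rule mapping observed history to actions can be made concrete. The sketch below encodes the Saudi Arabian strategy given above as two Python functions and then tabulates the 1989 action for every conceivable history, including histories the strategy itself would never produce. The function names and the string labels "L", "H", "Weak", "Strong" are illustrative choices, not notation from the text.

```python
from itertools import product

ACTIONS = ("L", "H")
DEMANDS = ("Weak", "Strong")


def saudi_1988(demand):
    """1988 output as a function of demand alone (Others's move is not yet known)."""
    return "L" if demand == "Weak" else "H"


def saudi_1989(demand, q_s8, q_o8):
    """1989 output as a function of demand and both players' 1988 outputs."""
    if demand == "Weak" and q_s8 == "L" and q_o8 == "L":
        return "L"
    return "H"


def full_plan():
    """Tabulate the 1989 action for every history, reachable or not."""
    return {
        (d, q_s8, q_o8): saudi_1989(d, q_s8, q_o8)
        for d, q_s8, q_o8 in product(DEMANDS, ACTIONS, ACTIONS)
    }


if __name__ == "__main__":
    for history, action in full_plan().items():
        print(history, "->", action)
```

Note that neither function takes Others's 1989 output or Others's strategy as an argument: only observed history enters, as the text requires.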


By player i's payoff π_i(s_1,...,s_n), we mean either:


(1) The utility he receives after all players and Nature have picked their strate- 
gies and the game has been played out; or 


(2) The expected utility he receives as a function of the strategies chosen by
himself and the other players.

Definitions (1) and (2) are distinct and different, but in the literature and this book
the term “payoff” is used for both the actual payoff and the expected payoff. The
context will make clear which is meant.

We assumed above that the payoffs of Others and Saudi Arabia were the sums of
their oil revenues over the two years of production. If we were not merely sketching
the model, we would specify π_S and π_O as functions relating the sum of revenues
to the strength of demand and the outputs in the two years.


The outcome of the game is a set of interesting elements that the modeller 
picks from the values of actions, payoffs, and other variables after the game 
is played out. 


The definition of the outcome for any particular model depends on what variables 
are interesting to the modeller. One outcome of OPEC Model I is 


5,3 = L, Qs. = Н, Оов = Н, Дог = L, О = Г, ts 100, по = 80, (1.1) 


where 100 and 80 are the values specified by the payoff functions. The outcome 
could be more narrowly defined as just the set of payoffs or the levels of output. 
Which definition is chosen depends on what you think is interesting about OPEC. 
This entire model, in fact, is only one of the many possible models of OPEC. For 
contrast, another game representing OPEC is OPEC Model II. 

Choosing the suitable model is where much of the talent of a modeller is displayed,
since he must trade off realism, ease of solution, and clarity of presentation. How 
would you decide between the two OPEC models? 


Equilibrium 

To predict the outcome of a game, the modeller focusses on the possible strategy 
combinations, since it is the interaction of the different players' strategies that 
determines what happens. The distinction between strategy combinations, which 
are sets of strategies, and outcomes, which are sets of values of whichever variables 
are considered interesting, is a common source of confusion. Often different strategy 
combinations lead to the same outcome. In OPEC Model I, the single outcome 


(Q_{S,8} = L, Q_{O,8} = L, Q_{S,9} = L, Q_{O,9} = L, D = Strong, π_S = 100, π_O = 80)   (1.2)

is produced by either of the two strategy combinations, the Golden Rule or An Eye 
for an Eye. 


The Golden Rule: (Low output no matter what happens) 


Saudi Arabia: (Q_{S,8} = L; Q_{S,9} = L),
Others:       (Q_{O,8} = L; Q_{O,9} = L);


An Eye for an Eye: (Retaliate)


Saudi Arabia: (Q_{S,8} = L; Q_{S,9} = L if Q_{O,8} = L, Q_{S,9} = H otherwise),
Others:       (Q_{O,8} = L; Q_{O,9} = L if Q_{S,8} = L, Q_{O,9} = H otherwise).


Under the Golden Rule, both players always choose low output, so output is low 
in both years. Under An Eye for an Eye (a variant of the "tit-for-tat" strategy of 
Section 4.6), both players choose low output in 1988, and in 1989 each chooses the 
output the other had chosen the previous year, so output is also low. 
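
That two different strategy combinations can generate the same outputs is easy to check mechanically. In the sketch below each strategy is represented as a pair: a 1988 action and a rule giving the 1989 action as a function of the other player's 1988 action. This representation, the helper names, and the extra `always_high` strategy are illustrative assumptions; demand is omitted because neither strategy conditions on it.

```python
def golden_rule():
    """Low output no matter what happens."""
    return "L", lambda other_1988: "L"


def eye_for_an_eye():
    """Start Low; in 1989 copy what the other player did in 1988."""
    return "L", lambda other_1988: "L" if other_1988 == "L" else "H"


def always_high():
    """A hypothetical deviant strategy, used only to show off-path behavior."""
    return "H", lambda other_1988: "H"


def play(saudi, others):
    """Return the outputs generated by one strategy combination."""
    s8, s9_rule = saudi
    o8, o9_rule = others
    return {"Q_S8": s8, "Q_O8": o8, "Q_S9": s9_rule(o8), "Q_O9": o9_rule(s8)}


if __name__ == "__main__":
    print(play(golden_rule(), golden_rule()))
    print(play(eye_for_an_eye(), eye_for_an_eye()))
    # The strategies differ only in how they respond to a High output.
    print(play(golden_rule(), always_high()))
    print(play(eye_for_an_eye(), always_high()))
```

The two combinations coincide on the path of play and differ only in their responses to outputs that neither of them produces, which is why strategies and outcomes must be kept distinct.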


An equilibrium s* = (s_1*,...,s_n*) is a strategy combination consisting of a
best strategy for each of the n players in the game.


The equilibrium strategies are the strategies players pick in trying to max- 
imize their individual payoffs, as distinct from the many possible strategy com- 
binations obtainable by randomly choosing one strategy per player. Equilibrium 
is used differently in game theory than in other areas of economics. In a general
equilibrium model, for example, an equilibrium is a set of prices resulting from
optimal behavior by the individuals in the economy. In game theory, that set of
prices would be the equilibrium outcome, but the equilibrium itself would be
the strategy combination (individuals’ rules for buying and selling) that generated
the outcome.

To find the equilibrium it is not enough to specify the players, strategies, and
payoffs, because the modeller must also decide what “best strategy” means. He
does this by defining a solution concept.


An equilibrium concept or solution concept F : {S_1,...,S_n, π_1,...,π_n}
→ s* is a rule that defines an equilibrium based on the possible strategy
combinations and the payoff functions.


Only a few equilibrium concepts are generally accepted, and the remaining sections 
of this chapter are devoted to finding the equilibrium using the two best-known 
concepts: dominant strategy and Nash equilibrium. 


Uniqueness 


Because the accepted solution concepts do not guarantee uniqueness, lack of a 
unique equilibrium is a major problem for game theory. Often the solution concept 
employed leads us to believe that the players will pick one of the two strategy 
combinations A or B, not C or D, but we cannot say whether A or B is more likely. 
Sometimes we have the opposite problem and the game has no equilibrium at all; 
by this is meant either that the modeller sees no good reason why one strategy 
combination is more likely than another, or that some player wants to pick an 
infinite value for one of his actions, something we will discuss further in Section 
5.6. 

A model with no equilibrium or multiple equilibria is underspecified. The mod- 
eller has failed to provide a full and precise prediction for what will happen. One 
option is to admit that his theory is incomplete: an admission of incompleteness 
like the Folk Theorem of Section 4.6 is a valuable negative result. Or perhaps the 
situation being modelled really is unpredictable. Another option is to renew the 
attack by changing the game’s description or the solution concept. Preferably it 
is the description that is changed, since economists look to the rules of the game 
for the differences between models, and not to the solution concept, and the reader 
is likely to feel tricked if an important part of the game is concealed under the 
definition of equilibrium. 


1.2 Dominant Strategies: the Prisoner’s Dilemma 


In discussing equilibrium concepts, it is useful to have shorthand notation for “all 
the other players’ strategies.” 


For any vector y = (y_1,...,y_n), denote by y_{-i} the vector (y_1,...,y_{i-1},
y_{i+1},...,y_n), which is the portion of y not associated with player i.

Using this notation, s_{-i}, for instance, is the combination of strategies of every
player except player i. That combination is of great interest to player i, because he
uses it to help choose his own strategy, and the new notation helps define his best
response.


Player i's best response or best reply to the strategies s_{-i} chosen by the
other players is the strategy s_i* that yields him the greatest payoff, that is,


π_i(s_i*, s_{-i}) ≥ π_i(s_i', s_{-i})   ∀ s_i' ≠ s_i*.   (1.3)


The best response is strongly best if no other strategies are equally good, and 
weakly best otherwise. 
The first important equilibrium concept is the dominant strategy equilibrium. 


The strategy s_i* is a dominant strategy if it is a player's strictly best
response to any strategies the other players might pick, in the sense that whatever
strategies they pick, his payoff is highest with s_i*. Mathematically,


π_i(s_i*, s_{-i}) > π_i(s_i', s_{-i})   ∀ s_{-i}, ∀ s_i' ≠ s_i*.   (1.4)
His inferior strategies are dominated strategies. 


A dominant strategy equilibrium is a strategy combination consisting of 
each player’s dominant strategy. 


A player’s dominant strategy is his strictly best response even to very stupid 
actions by the other players. Most games do not have dominant strategies, and the 
players must try to figure out each others’ actions to choose their own. 

In developing OPEC Model I, we incorporated considerable complexity to illus- 
trate such things as information sets and the time sequence of actions. To illustrate 
equilibrium concepts we will use simpler games such as the Prisoner’s Dilemma. In 
the Prisoner’s Dilemma, two prisoners, Messrs Row and Column, are being inter- 
rogated separately. If both confess (Fink) they are each sentenced to eight years 
in prison; if both hold out (Cooperate), each is sentenced to one year. If just one 
confesses, he is released, but the other prisoner is sentenced to ten years. The 
Prisoner’s Dilemma is an example of a 2-by-2 game, because each of the two 
players—Row and Column—has two possible actions in his action set—Fink and 
Cooperate. The payoffs are given by Table 1.1. 

Each player has a dominant strategy. Consider Row. Row does not know which
action Column is choosing, but if Column chooses Cooperate, Row faces a Cooperate
payoff of -1 and a Fink payoff of 0, whereas if Column chooses Fink, Row faces a
Cooperate payoff of -10 and a Fink payoff of -8. In either case Row does better
with Fink, and since the game is symmetric, Column's incentives are the same.
The dominant strategy equilibrium is (Fink, Fink), and the equilibrium payoffs are
(-8, -8), which is worse for both players than (-1, -1). Sixteen, in fact, is the
greatest possible combined total of years in prison.
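
The dominance argument above can be verified by brute force from the payoffs of Table 1.1. The sketch below is an illustration with hypothetical helper names; it tests inequality (1.4) directly, checking each action against every alternative and every action of the other player.

```python
ACTIONS = ("Cooperate", "Fink")

# Payoffs to (Row, Column), indexed by (Row's action, Column's action).
PAYOFFS = {
    ("Cooperate", "Cooperate"): (-1, -1),
    ("Cooperate", "Fink"): (-10, 0),
    ("Fink", "Cooperate"): (0, -10),
    ("Fink", "Fink"): (-8, -8),
}


def payoff(player, own, other):
    """Payoff to `player` (0 = Row, 1 = Column) from his own and the other's action."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]


def strictly_dominant(player):
    """Return the action that is strictly best against every opposing action, if any."""
    for candidate in ACTIONS:
        if all(
            payoff(player, candidate, other) > payoff(player, alternative, other)
            for other in ACTIONS
            for alternative in ACTIONS
            if alternative != candidate
        ):
            return candidate
    return None


if __name__ == "__main__":
    equilibrium = (strictly_dominant(0), strictly_dominant(1))
    print("dominant strategy equilibrium:", equilibrium)
    print("equilibrium payoffs:", PAYOFFS[equilibrium])
```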

Table 1.1 The Prisoner’s Dilemma


                           Column
                   Cooperate      Fink
       Cooperate    -1, -1       -10, 0
Row
       Fink          0, -10      -8, -8


Payoffs to: (Row, Column)


The result is even stronger. Because the equilibrium is a dominant strategy
equilibrium, the information structure of the game does not matter. If Column is
allowed to know Row’s move before taking his own, the equilibrium is unchanged.
Row still chooses Fink, knowing that Column will surely choose Fink afterwards.

If the outcome does not seem right to you, you should realize that very often 
the chief usefulness of a model is to induce discomfort. Discomfort is a sign that 
your model is not what you think it is—that you left out something essential to the 
result you expected and didn’t get. Either your original thought or your model is 
mistaken; and finding such mistakes is a real, if painful, benefit of model-building. 

The Prisoner’s Dilemma crops up in many different situations, including oligo- 
poly pricing, auction bidding, salesman effort, political bargaining, and arms races. 
Whenever you observe individuals in a conflict that hurts all of them, your first 
thought should be of the Prisoner’s Dilemma. 


Cooperative and Noncooperative Games 


What difference would it make if the two prisoners could talk to each other before 
making their decisions? It depends on the strength of promises—if promises are 
not binding, then although the two prisoners might agree not to fink, they would 
fink anyway when the time came to choose actions. 


A cooperative game is a game in which the players can make binding com- 
mitments, as opposed to a noncooperative game, in which they cannot. 


This definition draws the usual distinction between the two theories of games, 
but the real difference lies in the modelling approach. Both theories start off with 
the rules of the game, but they differ in the kinds of solution concepts employed. 
Cooperative game theory is axiomatic, frequently appealing to Pareto-optimality 
(see note N1.3), fairness, and equity. Noncooperative game theory is economic 
in flavor, with solution concepts based on players maximizing their own utility 
functions subject to stated constraints. Except for Section 10.2 of the chapter on 
bargaining, this book is concerned exclusively with noncooperative games. 

In applied economics, the most commonly encountered use of cooperative games 
is to model bargaining. The Prisoner’s Dilemma is a noncooperative game, but it 
could be modelled as cooperative by allowing the two players not only to commu- 
nicate, but to make binding commitments. Cooperative games often allow players 
to split the gains from cooperation by making side-payments—transfers between 
themselves that change the prescribed payoffs. Cooperative game theory generally
incorporates commitments and side-payments via the solution concept, which
can become very elaborate, while noncooperative game theory incorporates them 
by adding extra actions. The distinction between cooperative and noncooperative 
games does not lie in conflict or absence of conflict, as is shown by the following 
examples of situations commonly modelled one way or the other: 

A cooperative game without conflict: Members of a work force choose which 
of equally arduous tasks to undertake to best coordinate with each other. 


A cooperative game with conflict: Bargaining over price between a monop- 
olist and a monopsonist. 


A noncooperative game with conflict: The Prisoner’s Dilemma. 


A noncooperative game without conflict: Two companies set a product 
standard without communication. 


1.3 Iterated Dominance: the Battle of the Bismarck Sea 


Very few games have a dominant strategy equilibrium, but the Battle of the Bis- 
marck Sea shows how dominance can be used even when it does not resolve things 
quite so neatly as in the Prisoner’s Dilemma. The game is set in the South Pacific 
in 1943. Admiral Imamura has been ordered to transport Japanese troops across 
the Bismarck Sea to New Guinea, and Admiral Kenney wishes to bomb the troop 
transports. Imamura must choose between a shorter Northern route or a longer 
Southern route, and Kenney must decide where to send his planes to look for the 
Japanese. If Kenney sends his planes to the wrong route he can recall them, but 
the number of days of bombing is reduced. 

The players are Kenney and Imamura, and they each have the same action set, 
{North, South}, but their payoffs, given by Table 1.2, are never the same. Imamura
loses when Kenney gains. Because of this feature, the payoffs could be represented 
using just four numbers instead of eight, but listing all eight payoffs in Table 1.2 
saves the reader a little thinking. 


Table 1.2 The Battle of the Bismarck Sea 


                      Imamura
                 North      South
        North    2,-2       2,-2
Kenney
        South    1,-1       3,-3

Payoffs to: (Kenney, Imamura)


Strictly speaking, neither player has a dominant strategy. Kenney would choose
North if he thought Imamura would choose North, but South if he thought Imamura 
would choose South. Imamura would choose North if he thought Kenney would 
choose South, and he would be indifferent between actions if he thought Kenney 
would choose North. But we can still find a plausible equilibrium, using the concept 
of weak dominance, which differs from strong dominance only by replacing a strong 
inequality with a weak one. 


Strategy $s_i^*$ is a weakly dominant strategy if it is a player's best response
to any strategies the other players might pick, in the sense that whatever
strategies they pick, his payoff is no smaller with $s_i^*$ than with any other
strategy, and is greater in some strategy combination. Mathematically,


$\pi_i(s_i^*, s_{-i}) \geq \pi_i(s_i', s_{-i}) \quad \forall s_{-i}, \ \forall s_i',$


and $\pi_i(s_i^*, s_{-i}) > \pi_i(s_i', s_{-i}) \quad \forall s_i'$, for some $s_{-i}$. (1.5)


An iterated dominant strategy equilibrium is a strategy combination
found by deleting a weakly dominated strategy from the strategy set of one of 
the players, recalculating to find which remaining strategies are weakly domi- 
nated, deleting one of them, and continuing the process until only one strategy 
remains for each player. 


Admiral Imamura does have a weakly dominant strategy. North is weakly domi- 
nant, because Imamura’s payoff from North is no lower than his payoff from South, 
and is greater if Kenney picks South. Suppose that Kenney realizes this and decides 
that Imamura will pick North, deleting “Imamura chooses South” from consideration.
Having deleted one column of Table 1.2, Kenney now has a strongly dominant
strategy (where by “strongly” we mean that this strategy achieves payoffs
strictly greater than any others), and he chooses North. The strategy combination 
(North,North) is an iterated dominant strategy equilibrium, and despite all the 
qualifying adjectives it seems a good prediction. And indeed (North,North) was the 
outcome in 1943.
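
The deletion procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the payoffs are those of Table 1.2, while the function names (payoff, weakly_dominated, iterated_dominance) and the rule of deleting one strategy at a time, checking Kenney before Imamura, are choices made here and are not part of the text.

```python
# Payoffs from Table 1.2, keyed by (Kenney's action, Imamura's action).
# Each value is (Kenney's payoff, Imamura's payoff).
PAYOFFS = {
    ("North", "North"): (2, -2),
    ("North", "South"): (2, -2),
    ("South", "North"): (1, -1),
    ("South", "South"): (3, -3),
}


def payoff(player, own, other):
    """Payoff to `player` (0 = Kenney, 1 = Imamura) from playing `own`."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]


def weakly_dominated(player, s, own_set, other_set):
    """True if some other remaining strategy weakly dominates `s`."""
    for t in own_set:
        if t == s:
            continue
        diffs = [payoff(player, t, o) - payoff(player, s, o) for o in other_set]
        # Weak dominance: never worse, and strictly better at least once.
        if all(d >= 0 for d in diffs) and any(d > 0 for d in diffs):
            return True
    return False


def iterated_dominance(strategy_sets):
    """Delete one weakly dominated strategy at a time until none is left."""
    sets = [list(s) for s in strategy_sets]
    deleted = True
    while deleted:
        deleted = False
        for player in (0, 1):
            for s in sets[player]:
                if weakly_dominated(player, s, sets[player], sets[1 - player]):
                    sets[player].remove(s)
                    deleted = True
                    break  # restart the search after every deletion
            if deleted:
                break
    return sets


result = iterated_dominance([["North", "South"], ["North", "South"]])
print(result)  # [['North'], ['North']]
```

The first pass finds nothing to delete for Kenney, deletes South for Imamura, and the second pass then deletes South for Kenney, which reproduces the order of reasoning in the text.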

It is interesting to consider modifying the order of play or the information 
structure in the Battle of the Bismarck Sea. If Kenney moved first, rather than 
simultaneously with Imamura, (North,North) would remain an equilibrium, but 
(North, South) would also become one. The payoffs would be the same for both 
equilibria, but the outcomes would be different. 

If Imamura moved first, then (North,North) would be the only equilibrium. 
Imamura moving first is also equivalent to both players moving simultaneously 
when both know that Kenney has cracked the Japanese code and knows Imamura's 
plan. In both situations, Kenney's information set becomes either {Imamura moved 
North} or {Imamura moved South}, so Kenney’s equilibrium strategy is specified
as (North if Imamura moved North, South if Imamura moved South). 
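
The effect of changing the order of play can be checked mechanically. The sketch below assumes the payoffs of Table 1.2 and reasons backwards: the second mover best-responds to whatever he observes, and the first mover anticipates this. The names profile and sequential_outcomes are hypothetical, and the treatment of the second mover's ties (every way of breaking them is tried) is an assumption made for illustration.

```python
from itertools import product

ACTIONS = ("North", "South")
# Table 1.2: (Kenney's action, Imamura's action) -> (Kenney, Imamura) payoffs.
PAYOFFS = {
    ("North", "North"): (2, -2),
    ("North", "South"): (2, -2),
    ("South", "North"): (1, -1),
    ("South", "South"): (3, -3),
}


def profile(leader, a, b):
    """Order the leader's action a and follower's action b as (Kenney, Imamura)."""
    return (a, b) if leader == 0 else (b, a)


def sequential_outcomes(leader):
    """Outcomes when `leader` (0 = Kenney, 1 = Imamura) moves first."""
    follower = 1 - leader
    # The follower's best replies to each action he might observe.
    best = {}
    for a in ACTIONS:
        top = max(PAYOFFS[profile(leader, a, b)][follower] for b in ACTIONS)
        best[a] = [b for b in ACTIONS
                   if PAYOFFS[profile(leader, a, b)][follower] == top]
    outcomes = set()
    # Each way of resolving the follower's ties is one follower strategy.
    for replies in product(*(best[a] for a in ACTIONS)):
        plan = dict(zip(ACTIONS, replies))
        top = max(PAYOFFS[profile(leader, a, plan[a])][leader] for a in ACTIONS)
        for a in ACTIONS:
            if PAYOFFS[profile(leader, a, plan[a])][leader] == top:
                outcomes.add(profile(leader, a, plan[a]))
    return outcomes


kenney_first = sequential_outcomes(0)
imamura_first = sequential_outcomes(1)
print(sorted(kenney_first))   # [('North', 'North'), ('North', 'South')]
print(sorted(imamura_first))  # [('North', 'North')]
```

When Kenney leads with North, Imamura is indifferent, which is why two outcomes survive; when Imamura leads, Kenney's reply is always strict and only (North, North) remains.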

Although the 2-by-2 games in this chapter may seem facetious, they are simple 
enough to adapt to economic situations. The Battle of the Bismarck Sea, for 
example, can be turned into a game of corporate strategy. Two firms, Kenney 
and Imamura, are trying to maximize their shares of a market of constant size 
by choosing between the two product designs North and South. Kenney has a 
marketing advantage, and would like to compete head-to-head, while Imamura 
would rather carve out its own niche. The equilibrium is (North, North).

The Battle of the Bismarck Sea is special because the payoffs of the players 
always sum to zero. This feature is important enough to deserve a name. 

A zero-sum game is a game in which the sum of the payoffs of all the
players is zero whatever strategies they choose. A game which is not zero-sum
is non-zero-sum.


In a zero-sum game, what one player gains, another player must lose. The Battle 
of the Bismarck Sea is a zero-sum game, but the Prisoner's Dilemma and OPEC 
Models I and II are not. Although zero-sum games have fascinated game theorists
for many years, they are uncommon in economics. One of the few examples is the 
bargaining game between two players who divide a surplus, but even this is usually 
modelled nowadays as a non-zero-sum game in which the surplus shrinks as the
players spend more time deciding how to divide it. 


1.4 Nash Equilibrium: Boxed Pigs, the Battle of the 
Sexes, and Pure Coordination 


For the vast majority of games, which lack even iterated dominant strategy equi- 
libria, we use Nash equilibrium, the most important and widespread equilibrium 
concept. To introduce Nash equilibrium we will use the game Boxed Pigs (Baldwin 
& Meese [1979]). Two pigs are put into a Skinner box with a special panel at one 
end and a food dispenser at the other. When the panel is pressed, at a utility cost 
of 2 units, 10 units of food is dispensed. One pig is “dominant” (let us assume 
he is larger), and if he gets to the dispenser first, the other pig will only get his 
leavings, worth 1 unit. The small pig does somewhat better if he gets there first, 
eating 4 units of food, and even if they arrive at the same time he can eat 3 units. 
Table 1.3 summarizes the payoffs for the strategies Press the panel and Wait by 
the dispenser. 


Table 1.3 Boxed Pigs 


                       Small Pig
                   Press      Wait
           Press   5,1        4,4
Large Pig
           Wait    9,-1       0,0

Payoffs to: (Large Pig, Small Pig)


Boxed Pigs has no dominant strategy equilibrium, because what the large pig 
chooses depends on what he thinks the small pig will choose. If he believed that 
the small pig would press the panel, the large pig would wait by the dispenser, but
if he believed that the small pig would wait, the large pig would press. There does 
exist an iterated dominant strategy equilibrium, (Press, Wait), but we will employ 
a different line of reasoning to justify that outcome.

The equilibrium concept that is standardly employed is Nash equilibrium, which 
is less obviously correct than dominant strategy equilibrium, but more often appli- 
cable. Nash equilibrium is so widely accepted that the reader can assume that if a 
model does not specify which equilibrium concept is being used it is Nash. 


The strategy combination $s^*$ is a Nash equilibrium if no player has incentive
to deviate from his strategy given that the other players do not deviate.
Formally,


$\forall i, \quad \pi_i(s_i^*, s_{-i}^*) \geq \pi_i(s_i', s_{-i}^*), \quad \forall s_i'.$ (1.6)


The strategy combination (Press, Wait) is a Nash equilibrium. The way to 
approach Nash equilibrium is to propose a strategy combination and test whether 
each player's strategy is a best response to the others' strategies. If the large pig 
picks Press, the small pig, who faces a choice between a payoff of 1 from pressing 
and 4 from waiting, is willing to wait. If the small pig picks Wait, the large pig, 
who has a choice between a payoff of 4 from pressing and 0 from waiting, is willing 
to press. This confirms that (Press, Wait) is a Nash equilibrium, and in fact it is 
the unique Nash equilibrium. 
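
The propose-and-test procedure lends itself to a brute-force check. The following sketch assumes the payoffs of Table 1.3 and simply tests every strategy combination for a profitable unilateral deviation; the helper names is_nash and pure_nash_equilibria are illustrative, not from the text.

```python
from itertools import product

ACTIONS = ("Press", "Wait")
# Table 1.3: (large pig's action, small pig's action) -> (large, small) payoffs.
BOXED_PIGS = {
    ("Press", "Press"): (5, 1),
    ("Press", "Wait"): (4, 4),
    ("Wait", "Press"): (9, -1),
    ("Wait", "Wait"): (0, 0),
}


def is_nash(game, profile, actions=ACTIONS):
    """True if neither player gains by deviating alone from `profile`."""
    for player in (0, 1):
        for alt in actions:
            deviation = list(profile)
            deviation[player] = alt
            if game[tuple(deviation)][player] > game[profile][player]:
                return False
    return True


def pure_nash_equilibria(game, actions=ACTIONS):
    """All pure-strategy combinations that pass the deviation test."""
    return [p for p in product(actions, repeat=2) if is_nash(game, p, actions)]


equilibria = pure_nash_equilibria(BOXED_PIGS)
print(equilibria)  # [('Press', 'Wait')]
```

Only pure strategies are examined here, so the check says nothing about randomized strategies.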

The pigs in this game have more thinking to do than the players in the Prisoner's 
Dilemma. They have to realize that the only set of strategies which is supported 
by self-consistent beliefs is (Press, Wait). The definition of Nash equilibrium lacks 
the extra "$\forall s_{-i}$" of dominant strategy equilibrium, so a Nash strategy need only
be a best response to the other Nash strategies, not to all possible strategies. And
although we talk of "best responses," the moves are actually simultaneous, so the
players are predicting each others’ moves. If the game were repeated, or the players 
communicated, Nash equilibrium would be especially attractive, because it is even 
more compelling that beliefs should be consistent. 

Every dominant strategy equilibrium is a Nash equilibrium, but not every Nash 
equilibrium is a dominant strategy equilibrium. If a strategy is dominant it is a 
best response to any strategies the other players pick, including their equilibrium 
strategies. If a strategy is part of a Nash equilibrium, it need only be a best 
response to the other players’ equilibrium strategies. This is shown by the Nash 
Puzzle in Table 1.4, which has (Up, Left) and (Down, Right) as Nash equilibria, 
and (Down, Right) as an iterated dominant strategy equilibrium. Consider the 
Nash equilibrium (Up, Left). Neither Smith nor Brown has incentive to unilaterally 
deviate from it; Smith's payoff would remain 0, and Brown's would drop from 1 to 
0. Despite forming part of a Nash equilibrium, Up is a weakly dominated strategy 
for Smith: it yields him either 0 or -2, depending on what Brown does, while
Down yields him 0 or -1. But if Smith expects Brown to choose Left, as he does
in equilibrium, there is no reason for Smith to prefer Down. In fact, the Nash 
equilibrium (Up, Left) Pareto-dominates (Down, Right); it is preferred by both 
players. 
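
The claims about the Nash Puzzle can be verified directly from the payoffs of Table 1.4. The sketch below is illustrative only; pure_nash and row_weakly_dominates are names chosen here, and the code looks at pure strategies alone.

```python
ROWS, COLS = ("Up", "Down"), ("Left", "Right")
# Table 1.4: (Smith's action, Brown's action) -> (Smith, Brown) payoffs.
NASH_PUZZLE = {
    ("Up", "Left"): (0, 1),
    ("Up", "Right"): (-2, 0),
    ("Down", "Left"): (0, -1),
    ("Down", "Right"): (-1, 0),
}


def pure_nash(game, rows, cols):
    """Cells in which each player's action is a best response to the other's."""
    found = []
    for r in rows:
        for c in cols:
            row_ok = all(game[(r, c)][0] >= game[(r2, c)][0] for r2 in rows)
            col_ok = all(game[(r, c)][1] >= game[(r, c2)][1] for c2 in cols)
            if row_ok and col_ok:
                found.append((r, c))
    return found


def row_weakly_dominates(game, a, b, cols):
    """True if row action a weakly dominates row action b for Smith."""
    diffs = [game[(a, c)][0] - game[(b, c)][0] for c in cols]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


equilibria = pure_nash(NASH_PUZZLE, ROWS, COLS)
print(equilibria)                                              # [('Up', 'Left'), ('Down', 'Right')]
print(row_weakly_dominates(NASH_PUZZLE, "Down", "Up", COLS))   # True
```

So Up is weakly dominated by Down and yet appears in the equilibrium (Up, Left), exactly the situation the paragraph describes.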

Table 1.4 The Nash Puzzle

                   Brown
               Left      Right
        Up     0,1       -2,0
Smith
        Down   0,-1      -1,0

Payoffs to: (Smith, Brown)


Like a dominant strategy equilibrium, a Nash equilibrium can be either weak
or strong. The definition above is for a weak Nash equilibrium. To define the
strong Nash equilibrium, make the inequality strict; that is, say that no player is
indifferent between his equilibrium strategy and some other strategy.
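
The weak/strong distinction amounts to checking whether any of the inequalities in the definition holds with equality. As a rough sketch, assuming the payoffs of Tables 1.3 and 1.4 and a hypothetical helper named classify:

```python
# Table 1.4: (Smith, Brown) payoffs; Table 1.3: (Large Pig, Small Pig) payoffs.
NASH_PUZZLE = {
    ("Up", "Left"): (0, 1),
    ("Up", "Right"): (-2, 0),
    ("Down", "Left"): (0, -1),
    ("Down", "Right"): (-1, 0),
}
BOXED_PIGS = {
    ("Press", "Press"): (5, 1),
    ("Press", "Wait"): (4, 4),
    ("Wait", "Press"): (9, -1),
    ("Wait", "Wait"): (0, 0),
}


def classify(game, rows, cols, profile):
    """Label a pure-strategy combination 'strong', 'weak', or 'not Nash'."""
    r, c = profile
    here = game[(r, c)]
    # Gain from staying put rather than switching to each alternative.
    gaps = [here[0] - game[(r2, c)][0] for r2 in rows if r2 != r]
    gaps += [here[1] - game[(r, c2)][1] for c2 in cols if c2 != c]
    if min(gaps) < 0:
        return "not Nash"
    return "strong" if min(gaps) > 0 else "weak"


puzzle_rows, puzzle_cols = ("Up", "Down"), ("Left", "Right")
pig_actions = ("Press", "Wait")
print(classify(NASH_PUZZLE, puzzle_rows, puzzle_cols, ("Up", "Left")))     # weak
print(classify(NASH_PUZZLE, puzzle_rows, puzzle_cols, ("Down", "Right")))  # strong
print(classify(BOXED_PIGS, pig_actions, pig_actions, ("Press", "Wait")))   # strong
```

(Up, Left) is only a weak Nash equilibrium because Smith is indifferent between Up and Down when Brown plays Left.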


The Battle of the Sexes 


The third game we will use to illustrate Nash equilibrium is the Battle of the Sexes, 
a conflict between a man who wants to go to a prize fight and a woman who wants 
to go to a ballet. While selfish, they are deeply in love, and would, if necessary, 
sacrifice their preferences to be with each other. Less romantically, their payoffs 
are given by Table 1.5. 


Table 1.5 The Battle of the Sexes

                          Woman
                   Prize Fight    Ballet
      Prize Fight  2,1            -1,-1
Man
      Ballet       -5,-5          1,2

Payoffs to: (Man, Woman)


The Battle of the Sexes does not have an iterated dominant strategy equilib- 
rium. It has two Nash equilibria, one of which is the strategy combination (Prize 
Fight,Prize Fight). Given that the man chooses Prize Fight, so does the woman; 
given that the woman chooses Prize Fight, so does the man. The strategy combi- 
nation (Ballet, Ballet) is another Nash equilibrium, by the same line of reasoning. 

How do the players know which Nash equilibrium to choose? Going to the fight 
and going to the ballet are both Nash strategies, but for different equilibria. Nash 
equilibrium assumes correct and consistent beliefs. If they do not talk beforehand, 
the man might go to the ballet and the woman to the fight, each mistaken about 
the other's beliefs. But even if the players do not communicate, Nash equilibrium is 
sometimes justified by repetition of the game. If the couple do not talk, but repeat 
the game night after night, one may suppose that eventually they settle on one of 
the Nash equilibria. 

Each of the Nash equilibria in the Battle of the Sexes is Pareto-efficient; that is, 
no other strategy combination increases the payoff of one player without decreasing
that of the other. In many games the Nash equilibrium is not Pareto-efficient:
(Fink, Fink), for example, is the unique Nash equilibrium of the Prisoner’s Dilemma,
although its payoffs of (-8, -8) are Pareto-inferior to the (-1, -1) generated by
(Cooperate, Cooperate). And we just saw in the Nash Puzzle that one Nash equi- 
librium can Pareto-dominate another. 

Who moves first is important in these games. In the Nash Puzzle, if either player 
moved first we would expect (Up, Left) to be the equilibrium. In the Battle of the 
Sexes, if the man could buy the fight ticket in advance, his commitment would 
induce the woman to go to the fight. But if, instead, the woman could buy the 
ballet ticket in advance, her commitment would induce the man to go to the ballet. 
In many games, but not all, the player who moves first (which is equivalent to 
commitment) has a first-mover advantage. 

The Battle of the Sexes has many economic applications. One is the choice of 
an industrywide standard when two firms have different preferences but both want 
a common standard to encourage consumers to buy the product. A second is
the choice of language used in a contract when two firms want to formalize a sales
agreement but they prefer different terms.


Pure Coordination 


Sometimes one can use the size of the payoffs to choose between Nash equilibria. 
In the following game, players Smith and Brown are trying to decide whether to 
design the computers they sell to use large or small floppy disks. Both players will 
sell more computers if their disk drives are compatible. The payoffs are given by 
Table 1.6. 


Table 1.6 Pure Coordination 


                    Brown
                Large     Small
        Large   2,2       -1,-1
Smith
        Small   -1,-1     1,1

Payoffs to: (Smith, Brown)


The strategy combinations (Small,Small) and (Large, Large) are both Nash equi- 
libria, but (Large, Large) Pareto-dominates (Small, Small). Unlike in the Nash Puz- 
zle, no strategy is weakly dominant. Both players prefer (Large,Large), and most 
modellers would use the Pareto-efficient equilibrium to predict the actual outcome. 
We could imagine that it arises from pregame communication between Smith and 
Brown taking place outside of the specification of the model, but the interesting 
question is what happens if communication is impossible. Is the Pareto-efficient 
equilibrium still more plausible? The question is really one of psychology rather 
than economics. 
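
Pareto comparisons of this kind are easy to mechanize. The sketch below assumes the payoffs of Tables 1.5 and 1.6 and uses weak Pareto-dominance (nobody worse off, somebody better off); the function names are illustrative and do not appear in the text.

```python
# Table 1.5: (Man's action, Woman's action) -> (Man, Woman) payoffs.
BATTLE_OF_THE_SEXES = {
    ("Prize Fight", "Prize Fight"): (2, 1),
    ("Prize Fight", "Ballet"): (-1, -1),
    ("Ballet", "Prize Fight"): (-5, -5),
    ("Ballet", "Ballet"): (1, 2),
}
# Table 1.6: (Smith's action, Brown's action) -> (Smith, Brown) payoffs.
PURE_COORDINATION = {
    ("Large", "Large"): (2, 2),
    ("Large", "Small"): (-1, -1),
    ("Small", "Large"): (-1, -1),
    ("Small", "Small"): (1, 1),
}


def pareto_dominates(u, v):
    """True if payoff vector u makes nobody worse off and somebody better off than v."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_efficient(game):
    """Strategy combinations whose payoffs no other combination Pareto-dominates."""
    return [p for p, u in game.items()
            if not any(pareto_dominates(v, u) for v in game.values())]


print(pareto_efficient(BATTLE_OF_THE_SEXES))
# [('Prize Fight', 'Prize Fight'), ('Ballet', 'Ballet')]
print(pareto_efficient(PURE_COORDINATION))
# [('Large', 'Large')]
```

Both Nash equilibria of the Battle of the Sexes survive the test, whereas in Pure Coordination the equilibrium (Small, Small) is dominated by (Large, Large).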


1.5 Focal Points


Although Thomas Schelling's book The Strategy of Conflict was published almost 
30 years ago, it is surprisingly modern in spirit. Schelling is not a mathematician 
but a strategist, and he examines such things as threats, commitments, hostages, 
and delegation, which we will examine in a more formal way in the remainder of
this book. He is perhaps best known for his coordination games. Take a moment to 
decide on a strategy in each of the following games, adapted from Schelling, which 
you win by matching your response to those of as many of the other players as 
possible. 


(1) Name Heads or Tails. 

(2) Name Tails or Heads. 

(3) Circle one of the following numbers 7, 100, 13, 261, 99, 666. 

(4) You are to meet somebody in New York City. When? Where? 

(5) You are to split a pie, and get nothing if your proportions add to more than 
100 percent. 


Each of the games above has many Nash equilibria: if I think you will choose 
666, and you think I will choose 666, we both choose it. But to a greater or lesser 
extent they also have Nash equilibria that seem more likely. Certain of the strategy 
combinations are focal points: Nash equilibria which for psychological reasons are 
particularly compelling. Formalizing what makes a strategy combination a focal 
point is hard and depends on the context. In Example (3), Schelling found 7 to 
be the most common strategy, but in a group of Satanists, 666 might be the focal 
point. In repeated games, focal points are often provided by past history. If we 
split a pie once, we are likely to agree on 50:50. But if last year we split a pie in 
the ratio 60:40, that provides a focal point for this year. 

The boundary is a particular kind of focal point. If player Russia chooses the 
action of putting his troops anywhere from one inch to 100 miles away from the 
Chinese border, player China does not react. If he chooses to put troops from one 
inch to 100 miles beyond the border, China declares war. There is an arbitrary 
discontinuity in behavior at the boundary. Another example, quite vivid in its 
arbitrariness, is the rallying cry, “Fifty-Four Forty or Fight!,” which refers to the 
geographic parallel claimed as the boundary by jingoist Americans in the Oregon 
dispute between Britain and the United States in the 1840s. 

Once the boundary is established, it takes on additional significance, because 
behavior with respect to the boundary conveys information. When Russia crosses 
an established boundary, that tells China that Russia intends to make a serious 
incursion further into China. Boundaries must be sharp and well-known if they 
are not to be violated, and a large part of both law and diplomacy is devoted to 
clarifying them. Boundaries can also arise in business: two companies producing 
an unhealthful product might agree not to mention relative healthfulness in their 
advertising, but a boundary rule like “Mention unhealthfulness if you like, but do 
not stress it,” would not work. 


1 The threat was not credible: that parallel is now deep in British Columbia.

Mediation and communication are both important in the absence of a clear
focal point. If players can communicate, they can tell each other what actions they 
will take, and sometimes, as in Pure Coordination, this works, because they have 
no motive to lie. If the players cannot communicate, a mediator may be able to 
help by suggesting an equilibrium to all of them. They have no reason not to take 
the suggestion, and they would use the mediator even if his services were costly. 
Mediation in cases like this is as effective as arbitration, in which an outside party 
imposes a solution. 

One disadvantage of focal points is that they lead to inflexibility. Suppose the 
Pareto-superior equilibrium (Large, Large) was chosen as a focal point in Pure 
Coordination, but the game was repeated over a long interval of time. The numbers 
in the payoff matrix might slowly change until (Small,Small) and (Large,Large) both 
had payoffs of 1.5, and (Small,Small) might start to dominate. When, if ever, would 
the equilibrium switch? 

In Pure Coordination, we would expect that after some time one firm would 
switch and the other would follow. If there were communication, the switch point 
would be at the payoff of 1.5. But what if the first firm to switch is penalized more? 
Such is the problem in oligopoly pricing. If costs rise, so should the monopoly price, 
but whichever firm raises its price first suffers a loss of market share. 


Recommended Reading 


Bernheim, B. Douglas (1984a) “Rationalizable Strategic Behavior” Econometrica. 
July 1984. 52, 4: 1007-28. 

Schelling, Thomas (1960) The Strategy of Conflict. Cambridge, Mass.: Harvard 
University Press, 1960. 


Problem 1 
A Discoordination Game 
Suppose that a man and a woman each choose whether to go to a prize fight or a
ballet. The man would rather go to the prize fight, and the woman to the ballet.
What is more important to them, however, is that the man wants to show up to
the same event as the woman, but the woman wants to avoid him.


(1) Construct a game matrix to illustrate this game, choosing numbers to fit the 
preferences described verbally. 

(2) If the woman moves first, what will happen? 

(3) Does the game have a first-mover advantage? 

(4) Show that there is no Nash equilibrium if the players move simultaneously. 


You will discover how to find a Nash equilibrium in random strategies for games 
like these in Chapter 3. 


Notes
N1.1 Basic Definitions 


• The standard descriptive names help both the modeller and his readers. For the modeller,
the names are useful because they help ensure that the important details of the
game have been fully specified. For his readers, they make the game easier to understand,
especially if, as with most technical papers, the paper is first skimmed quickly
to see if it is worth reading. The less clear a writer's style, the more closely he should
adhere to the standard names, which means that most of us ought to adhere very closely
indeed.

• Think of writing a paper as a game between author and reader, rather than as a single-player
production process. The author, knowing that he has valuable information but
imperfect means of communication, is trying to convey the information to the reader.
The reader does not know whether the information is valuable, and he must choose
whether to read the paper closely enough to find out. What are the possible equilibria?

• The equilibrium concepts that will be mentioned in this book, with the sections in
which they are introduced, are dominant strategy (1.2), iterated dominant strategy (1.3),
Nash (1.4), rationalizable (N1.4), Bayesian (2.5), correlated strategy (N3.3), subgame
perfect (4.2), coalition-proof Nash (N4.2), minimax (N4.6), maximin (N4.6), perfect
Bayesian (5.1), trembling hand perfect (5.1), sequential (5.1), proper (N5.2), divine
(N5.2), intuitive (5.2), evolutionarily stable strategy (ESS) (5.5), reactive (8.5), and
Wilson (8.5). A recent book on equilibrium concepts is Harsanyi & Selten (1988), which
emphasizes bargaining games.

• In OPEC Model I, the notation “Qs.3 = H” was used to denote “Saudi Arabian oil
output in 1988 is High.” A logically equivalent model uses the notation “Ху = 2” to
denote “Country 1's oil output in Period 1 is 2.” Does it make any difference which
notation is used?


N1.2 Dominant Strategies: the Prisoner's Dilemma 


• The Prisoner's Dilemma was named by Albert Tucker in an unpublished paper, although
the particular 2-by-2 matrix, discovered by Dresher and Flood, was already well-known.
Tucker was asked to give a talk on game theory to the psychology department at Stanford,
and invented a story to go with the matrix. Straffin (1980) tells this history.

• Herodotus describes an early example of the reasoning in the Prisoner's Dilemma in the
conspiracy of Darius against the Persian emperor. A group of nobles met and decided
to overthrow the emperor, and it was proposed to adjourn till another meeting. Darius
then spoke up and said that if they adjourned, he knew that one of them would go
straight to the emperor and fink on them, because if nobody else did, he would himself.
Darius also suggested a solution: that they immediately go to the palace and kill the
emperor.

The conspiracy also illustrates a way out of coordination games. After killing the
emperor, the nobles wished to select one of themselves as the new emperor. Rather than
fight, they agreed to go to a certain hill at dawn, and whoever's horse neighed first would
become emperor. Herodotus tells how Darius's groom manipulated this randomization
scheme to make him the new emperor.

• Philosophers are intrigued by the Prisoner's Dilemma: see Campbell & Sowden (1985),
a collection of articles on the Prisoner's Dilemma and the related Newcomb's paradox.

• Many economists are reluctant to use the concept of cardinal utility, and even more
reluctant to compare utility across individuals (see Cooter & Rappoport [1984]). Noncooperative
game theory never requires interpersonal utility comparisons, and only ordinal
utility is needed to find the equilibrium in the Prisoner's Dilemma. So long as each
player's rank ordering of payoffs in different outcomes is preserved, the payoffs can be
altered without changing the equilibrium. In general, the dominant strategy and pure
strategy Nash equilibria of games depend only on the ordinal ranking of the payoffs,
but the mixed strategy equilibria depend on the cardinal values (see Section 3.2, and
compare Chicken [Section 3.3] with the Hawk-Dove Game [Section 5.6]).

• If we consider only the ordinal ranking of the payoffs in 2-by-2 games, there are 78 distinct
games in which each player has a strict preference ordering over the four outcomes
(listed and described in Rapoport & Guyer [1966]), and 726 distinct games (Guyer &
Hamburger [1968]) if we allow ties in the payoffs.

• The Prisoner's Dilemma is not always defined the same way. If we consider just ordinal
payoffs, then the game in Table 1.7 is a Prisoner's Dilemma if c > a > d > b. If the
game is repeated, the cardinal values of the payoffs can be important. The requirement
2a > b + c > 2d should be added if the game is to be a standard Prisoner's Dilemma,
in which (Cooperate, Cooperate) and (Fink, Fink) are the best and worst possible
outcomes in terms of the sum of payoffs. Section 4.7 will show that an asymmetric
game called the One-Sided Prisoner’s Dilemma has properties similar to the standard
Prisoner’s Dilemma, but does not fit this definition.

Sometimes the game in which 2a < b + c is also called a Prisoner’s Dilemma, but then
the sum of the players’ payoffs is maximized when one finks and the other cooperates.
If the game were repeated, or they could use the correlated equilibria defined in Section
3.3, they would prefer taking turns being finked against, which would make the game a
coordination game similar to the Battle of the Sexes. David Shimko has suggested the
name “Battle of the Prisoners” for this (which is more dignified than “Sex Prisoners’
Dilemma”).


Table 1.7 The Prisoner’s Dilemma with general payoffs

                        Column
                  Cooperate    Fink
       Cooperate  a,a          b,c
Row
       Fink       c,b          d,d

Payoffs to: (Row, Column)
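
The two definitions in this note can be written as simple predicates on the payoffs a, b, c, and d of Table 1.7. This is a sketch for illustration: the function names are invented here, and the numerical values are assumptions, except that a = -1 and d = -8 match the (Cooperate, Cooperate) and (Fink, Fink) payoffs quoted in Section 1.4.

```python
def is_ordinal_pd(a, b, c, d):
    """Ordinal condition: finking alone (c) beats mutual cooperation (a),
    which beats mutual finking (d), which beats being finked on (b)."""
    return c > a > d > b


def is_standard_pd(a, b, c, d):
    """Adds the cardinal condition 2a > b + c > 2d on the sums of payoffs."""
    return is_ordinal_pd(a, b, c, d) and 2 * a > b + c > 2 * d


# Assumed values: a = -1 and d = -8 as quoted in the text; b and c are hypothetical.
print(is_standard_pd(a=-1, b=-10, c=0, d=-8))  # True

# Hypothetical payoffs with 2a < b + c: ordinally a Prisoner's Dilemma, but the
# sum of payoffs is highest when one player finks and the other cooperates.
print(is_ordinal_pd(a=3, b=0, c=8, d=1))       # True
print(is_standard_pd(a=3, b=0, c=8, d=1))      # False
```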


N1.3 Iterated Dominance: the Battle of the Bismarck Sea 


• The Battle of the Bismarck Sea can be found in Haywood (1954).

• The 2-by-2 form with just four entries that could be used for the Battle of the Bismarck
Sea and other zero-sum games is a matrix game, while the equivalent table with eight
entries is a bimatrix game. Games can be represented as bimatrix games even if they
have more than two moves, so long as the number is finite.

• The dominant strategy equilibrium of any game is unique, if it exists. Each player has
only one strategy (or none) whose payoff in any strategy combination is strictly higher
than the payoff from any other strategy, so only one strategy combination can be formed
out of dominant strategies. A game may have several weak dominant strategy equilibria,
however, because a single player can have two or more weakly dominant strategies.

An iterated strong dominant strategy equilibrium is unique, if it exists. An iterated
weak dominant strategy equilibrium is not always unique, because the order in which
strategies are deleted can matter to the final solution. If, however, all the weakly dominated
strategies are eliminated simultaneously at each round of elimination, the iterated
equilibrium is unique.

• If a game is zero-sum the utilities of the players can be represented so as to sum to zero
under any outcome. Since utility functions are to some extent arbitrary, the sum can
also be represented to be non-zero even if the game is zero-sum. Often modellers will
refer to a game as zero-sum even when the payoffs do not add up to zero, so long as the
payoffs add up to some constant amount. The difference is a trivial normalization.

• If outcome X strongly Pareto-dominates outcome Y, then all players have higher
utility under outcome X. If outcome X weakly Pareto-dominates outcome Y, some
player has higher utility under X, and no player has lower utility. A zero-sum game does 
not have outcomes that even weakly Pareto-dominate other outcomes. All its equilibria 
are Pareto-efficient, because no player gains without another player losing. 


N1.4 Nash Equilibrium: Boxed Pigs, the Battle of the Sexes, and Pure 
Coordination 


• I invented the payoffs for Boxed Pigs from the description of one of the experiments in
Baldwin & Meese (1979). They do not think of this as an experiment in game theory,
and they describe the result in terms of “reinforcement.” I invented the Nash Puzzle
and Pure Coordination for this book. The Battle of the Sexes is taken from p. 90 of
Luce & Raiffa (1957). I have changed their payoffs of (-1,-1) to (-5,-5) to fit the
story.

• Some people prefer “equilibrium point” to “Nash equilibrium,” but the latter is convenient,
since the name is “Nash” and not “Mazurkiewicz.”

• Bernheim (1984a) and Pearce (1984) use the idea of mutually consistent beliefs to arrive
at a different equilibrium concept than Nash. They define a rationalizable strategy
to be a strategy which is a best response for some set of rational beliefs in which a
player believes that the other players choose their best responses. The difference from
Nash is that not all players need have the same beliefs concerning which strategies will
be chosen, nor need their beliefs be consistent.

This idea is attractive in the context of Bertrand games (see Section 12.2). The Nash
equilibrium in the Bertrand game is weakly dominated (by picking any other price above
MC, which yields the same profit of zero as does the equilibrium). Rationalizability rules
that out.

е One of the best-known coordination problems is that of the QWERTY typewriter key- 
board, developed in the 1870s when typing had to proceed slowly to avoid jamming. 
QWERTY became the standard, although a US Navy study found in the 1940s that the 
faster speed possible with the Dvorak keyboard would amortize the cost of retraining 
full-time typists within ten days (David [1985]). Why large companies have not retrained 
their typists is a mystery. 

• А well-known discoordination game without conflict is the Highway Game. 10,000 
commuters decide each morning whether to drive on the San Diego Freeway or Sepul- 
veda Boulevard. A commuter’s payoff is higher if he chooses the road that has fewer 
drivers. What does he do? Discoordination games have no symmetric equilibria except 
in randomized strategies (see Section 3.2), but there are many asymmetric equilibria in 
which 5,000 commuters are on.each road. 

e Another well-known discoordination game is Matching Pennies. In this zero-sum, 
2-by-2 game, players Smith and Brown each choose Heads or Tails. If they choose the 
same action, Smith pays 10 to Brown. If they choose different actions, Brown pays 10 
to Smith. 

е О. Henry's story,“The Gift of the Magi" is about a coordination game that is note- 
worthy for the reason why communication is ruled out. The husband sells his watch 
to buy his wife combs for Christmas, while she sells her hair to buy him a watch fob. 
Communication would spoil the surprise, a worse outcome than discoordination. 

* Macroeconomics has more game theory in it than is readily apparent. The macroeco- 
nomic concept of rational ezpectations faces the same problems of multiple equilibria and 
consistency of expectations as Nash equilibrium. 

• Standard setting is an active topic of research at the time I write this. Recent examples
are Katz & Shapiro (1985) and Farrell & Saloner (1985). 

• In Section 3.3 we return to problems of coordination to discuss the concepts of
"correlated strategies" and "cheap talk."
N1.5 Focal Points


• Besides his 1960 book, Schelling has written books on diplomacy (1966) and the oddities
of aggregation (1978). Political scientists are now looking at the same issues more 
technically; see Brams & Kilgour (1988) and Ordeshook (1986). 

• In Chapter 12 of The General Theory, Keynes suggests that the stock market is a game 
with multiple equilibria, like a contest in which a newspaper publishes the faces of 20 
girls, and contestants submit the name of the one they think most people would submit 
as the prettiest. When the focal point changes, big swings in prices result. 

• Not all of what we call boundaries have an arbitrary basis. If the Chinese cannot defend
themselves as easily once the Russians cross the boundary at the Amur River, they have 
a clear reason to fight there. 
Chapter 1


Static Games of Complete 
Information 


In this chapter we consider games of the following simple form: 
first the players simultaneously choose actions; then the players 
receive payoffs that depend on the combination of actions just cho- 
sen. Within the class of such static (or simultaneous-move) games, 
we restrict attention to games of complete information. That is, each 
player's payoff function (the function that determines the player's 
payoff from the combination of actions chosen by the players) is 
common knowledge among all the players. We consider dynamic 
(or sequential-move) games in Chapters 2 and 4, and games of 
incomplete information (games in which some player is uncertain 
about another player's payoff function—as in an auction where 
each bidder's willingness to pay for the good being sold is un- 
known to the other bidders) in Chapters 3 and 4. 

In Section 1.1 we take a first pass at the two basic issues in 
game theory: how to describe a game and how to solve the re- 
sulting game-theoretic problem. We develop the tools we will use 
in analyzing static games of complete information, and also the 
foundations of the theory we will use to analyze richer games in 
later chapters. We define the normal-form representation of a game 
and the notion of a strictly dominated strategy. We show that some 
games can be solved by applying the idea that rational players 
do not play strictly dominated strategies, but also that in other 
games this approach produces a very imprecise prediction about 
the play of the game (sometimes as imprecise as "anything could 
happen"). We then motivate and define Nash equilibrium, a
solution concept that produces much tighter predictions in a very
broad class of games.

In Section 1.2 we analyze four applications, using the tools 
developed in the previous section: Cournot's (1838) model of im- 
perfect competition, Bertrand's (1883) model of imperfect com- 
petition, Farber's (1980) model of final-offer arbitration, and the 
problem of the commons (discussed by Hume [1739] and others). 
In each application we first translate an informal statement of the 
problem into a normal-form representation of the game and then 
solve for the game's Nash equilibrium. (Each of these applications 
has a unique Nash equilibrium, but we discuss examples in which
this is not true.)

In Section 1.3 we return to theory. We first define the
notion of a mixed strategy, which we will interpret in terms of one
player's uncertainty about what another player will do. We then
state and discuss Nash's (1950) Theorem, which guarantees that a
Nash equilibrium (possibly involving mixed strategies) exists in a
broad class of games. Since we present first basic theory in
Section 1.1, then applications in Section 1.2, and finally more theory
in Section 1.3, it should be apparent that mastering the additional
theory in Section 1.3 is not a prerequisite for understanding the
applications in Section 1.2. On the other hand, the ideas of a mixed
strategy and the existence of equilibrium do appear (occasionally)
in later chapters.

This and each subsequent chapter concludes with problems, 
suggestions for further reading, and references.


1.1 Basic Theory: Normal-Form Games and Nash
Equilibrium 


1.1.A Normal-Form Representation of Games


In the normal-form representation of a game, each player
simultaneously chooses a strategy, and the combination of strategies
chosen by the players determines a payoff for each player. We
illustrate the normal-form representation with a classic example,
The Prisoners' Dilemma. Two suspects are arrested and charged
with a crime. The police lack sufficient evidence to convict the
suspects, unless at least one confesses. The police hold the suspects in
separate cells and explain the consequences that will follow from
the actions they could take. If neither confesses then both will be
convicted of a minor offense and sentenced to one month in jail.
If both confess then both will be sentenced to jail for six months.
Finally, if one confesses but the other does not, then the confessor
will be released immediately but the other will be sentenced
to nine months in jail: six for the crime and a further three for
obstructing justice.

The prisoners' problem can be represented in the accompanying
bi-matrix. (Like a matrix, a bi-matrix can have an arbitrary
number of rows and columns; "bi" refers to the fact that, in a
two-player game, there are two numbers in each cell: the payoffs
to the two players.)


                          Prisoner 2
                       Mum        Fink
Prisoner 1    Mum    -1, -1      -9, 0
              Fink    0, -9      -6, -6


The Prisoners' Dilemma


In this game, each player has two strategies available: confess
(or fink) and not confess (or be mum). The payoffs to the two
players when a particular pair of strategies is chosen are given in
the appropriate cell of the bi-matrix. By convention, the payoff to
the so-called row player (here, Prisoner 1) is the first payoff given,
followed by the payoff to the column player (here, Prisoner 2).
Thus, if Prisoner 1 chooses Mum and Prisoner 2 chooses Fink, for
example, then Prisoner 1 receives the payoff -9 (representing nine
months in jail) and Prisoner 2 receives the payoff 0 (representing
immediate release).

We now turn to the general case. The normal-form representation
of a game specifies: (1) the players in the game, (2) the strategies
available to each player, and (3) the payoff received by each player
for each combination of strategies that could be chosen by the
players. We will often discuss an $n$-player game in which the
players are numbered from 1 to $n$ and an arbitrary player is called
player $i$. Let $S_i$ denote the set of strategies available to player $i$
(called $i$'s strategy space), and let $s_i$ denote an arbitrary member of
this set. (We will occasionally write $s_i \in S_i$ to indicate that the
strategy $s_i$ is a member of the set of strategies $S_i$.) Let $(s_1, \ldots, s_n)$
denote a combination of strategies, one for each player, and let
$u_i$ denote player $i$'s payoff function: $u_i(s_1, \ldots, s_n)$ is the payoff to
player $i$ if the players choose the strategies $(s_1, \ldots, s_n)$. Collecting
all of this information together, we have:


Definition The normal-form representation of an $n$-player game
specifies the players' strategy spaces $S_1, \ldots, S_n$ and their payoff functions
$u_1, \ldots, u_n$. We denote this game by $G = \{S_1, \ldots, S_n; u_1, \ldots, u_n\}$.
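
One way to make the definition concrete is to store the strategy spaces and payoff functions directly. The sketch below is only an illustration: the class name `NormalFormGame`, its fields, and the `JAIL_MONTHS` table are hypothetical names, and the payoffs are the jail terms stated above for the Prisoners' Dilemma.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class NormalFormGame:
    """G = {S_1, ..., S_n; u_1, ..., u_n} for a finite game."""
    strategy_spaces: Sequence[Sequence[str]]          # S_1, ..., S_n
    payoff_functions: Sequence[Callable[..., float]]  # u_1, ..., u_n

    def payoffs(self, *profile: str) -> tuple[float, ...]:
        """Evaluate every player's payoff at the profile (s_1, ..., s_n)."""
        return tuple(u(*profile) for u in self.payoff_functions)


# Payoffs are minus the months in jail; Prisoner 1's payoff comes first.
JAIL_MONTHS = {
    ("Mum", "Mum"): (-1, -1),
    ("Mum", "Fink"): (-9, 0),
    ("Fink", "Mum"): (0, -9),
    ("Fink", "Fink"): (-6, -6),
}

prisoners_dilemma = NormalFormGame(
    strategy_spaces=(("Mum", "Fink"), ("Mum", "Fink")),
    payoff_functions=(
        lambda s1, s2: JAIL_MONTHS[(s1, s2)][0],  # u_1
        lambda s1, s2: JAIL_MONTHS[(s1, s2)][1],  # u_2
    ),
)

print(prisoners_dilemma.payoffs("Mum", "Fink"))  # (-9, 0)
```

Keeping one payoff function per player mirrors the notation $u_i(s_1, \ldots, s_n)$; for two-player games a single table of payoff pairs, like the bi-matrix, is usually more convenient.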


Although we stated that in a normal-form game the players
choose their strategies simultaneously, this does not imply that the
parties necessarily act simultaneously: it suffices that each choose
his or her action without knowledge of the others' choices, as
would be the case here if the prisoners reached decisions at
arbitrary times while in their separate cells. Furthermore, although
in this chapter we use normal-form games to represent only static
games in which the players all move without knowing the other
players' choices, we will see in Chapter 2 that normal-form
representations can be given for sequential-move games, but also that
an alternative, the extensive-form representation of the game, is
often a more convenient framework for analyzing dynamic issues.


1.1.B Iterated Elimination of Strictly Dominated
Strategies 


Having described one way to represent a game, we now take a 
first pass at describing how to solve a game-theoretic problem. 
We start with the Prisoners’ Dilemma because it is easy to solve, 
using only the idea that a rational player will not play a strictly 
dominated strategy. 

In the Prisoners' Dilemma, if one suspect is going to play Fink,
then the other would prefer to play Fink and so be in jail for six
months rather than play Mum and so be in jail for nine months.
Similarly, if one suspect is going to play Mum, then the other
would prefer to play Fink and so be released immediately rather
than play Mum and so be in jail for one month. Thus, for prisoner
$i$, playing Mum is dominated by playing Fink: for each strategy
that prisoner $j$ could choose, the payoff to prisoner $i$ from playing
Mum is less than the payoff to $i$ from playing Fink. (The same
would be true in any bi-matrix in which the payoffs 0, -1, -6,
and -9 above were replaced with payoffs $T$, $R$, $P$, and $S$,
respectively, provided that $T > R > P > S$ so as to capture the ideas
of temptation, reward, punishment, and sucker payoffs.) More
generally:

Definition In the normal-form game $G = \{S_1, \ldots, S_n; u_1, \ldots, u_n\}$, let
$s_i'$ and $s_i''$ be feasible strategies for player $i$ (i.e., $s_i'$ and $s_i''$ are members of
$S_i$). Strategy $s_i'$ is strictly dominated by strategy $s_i''$ if for each feasible
combination of the other players' strategies, $i$'s payoff from playing $s_i'$ is
strictly less than $i$'s payoff from playing $s_i''$:

$$u_i(s_1, \ldots, s_{i-1}, s_i', s_{i+1}, \ldots, s_n) < u_i(s_1, \ldots, s_{i-1}, s_i'', s_{i+1}, \ldots, s_n) \tag{DS}$$

for each $(s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_n)$ that can be constructed from the other
players' strategy spaces $S_1, \ldots, S_{i-1}, S_{i+1}, \ldots, S_n$.
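
Condition (DS) translates almost line by line into code. The following is a minimal sketch for finite games, not a library routine: `strictly_dominated`, `pd_payoff`, and the table `JAIL` are hypothetical names, and the payoffs are those of the Prisoners' Dilemma described earlier.

```python
from itertools import product


def strictly_dominated(strategy_spaces, payoff, i, s_prime, s_double_prime):
    """Check (DS): is s_prime strictly dominated by s_double_prime for player i?

    strategy_spaces: list of the players' strategy spaces S_1, ..., S_n
    payoff(i, profile): player i's payoff u_i at the strategy profile (a tuple)
    """
    others = [S for k, S in enumerate(strategy_spaces) if k != i]
    # Loop over every combination of the other players' strategies.
    for rest in product(*others):
        low = rest[:i] + (s_prime,) + rest[i:]
        high = rest[:i] + (s_double_prime,) + rest[i:]
        if not payoff(i, low) < payoff(i, high):
            return False
    return True


# Prisoners' Dilemma payoffs (minus months in jail), Prisoner 1 first.
JAIL = {
    ("Mum", "Mum"): (-1, -1),
    ("Mum", "Fink"): (-9, 0),
    ("Fink", "Mum"): (0, -9),
    ("Fink", "Fink"): (-6, -6),
}
SPACES = [("Mum", "Fink"), ("Mum", "Fink")]


def pd_payoff(i, profile):
    return JAIL[profile][i]


print(strictly_dominated(SPACES, pd_payoff, 0, "Mum", "Fink"))  # True
print(strictly_dominated(SPACES, pd_payoff, 0, "Fink", "Mum"))  # False
```

The check considers domination by another pure strategy only, which is the case covered by the definition above.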


Rational players do not play strictly dominated strategies,
because there is no belief that a player could hold (about the
strategies the other players will choose) such that it would be optimal
to play such a strategy.¹ Thus, in the Prisoners' Dilemma, a
rational player will choose Fink, so (Fink, Fink) will be the outcome
reached by two rational players, even though (Fink, Fink) results
in worse payoffs for both players than would (Mum, Mum).
Because the Prisoners' Dilemma has many applications (including
the arms race and the free-rider problem in the provision of
public goods), we will return to variants of the game in Chapters 2
and 4. For now, we focus instead on whether the idea that rational
players do not play strictly dominated strategies can lead to the
solution of other games.

Consider the abstract game in Figure 1.1.1.² Player 1 has two
strategies and player 2 has three: $S_1 = \{\text{Up}, \text{Down}\}$ and $S_2 =
\{\text{Left}, \text{Middle}, \text{Right}\}$. For player 1, neither Up nor Down is strictly


¹ A complementary question is also of interest: if there is no belief that player $i$
could hold (about the strategies the other players will choose) such that it would
be optimal to play the strategy $s_i$, can we conclude that there must be another
strategy that strictly dominates $s_i$? The answer is "yes," provided that we adopt
appropriate definitions of "belief" and "another strategy," both of which involve
the idea of mixed strategies to be introduced in Section 1.3.A.

² Most of this book considers economic applications rather than abstract
examples, both because the applications are of interest in their own right and because,
for many readers, the applications are often a useful way to explain the
underlying theory. When introducing some of the basic theoretical ideas, however,
we will sometimes resort to abstract examples that have no natural economic
interpretation.
[Figure 1.1.1. Bi-matrix for the abstract game: Player 1 chooses Up or Down (rows); Player 2 chooses Left, Middle, or Right (columns).]


dominated: Up is better than Down if 2 plays Left (because 1 > 0),
but Down is better than Up if 2 plays Right (because 2 > 0). For
player 2, however, Right is strictly dominated by Middle (because
2 > 1 and 1 > 0), so a rational player 2 will not play Right.
Thus, if player 1 knows that player 2 is rational then player 1 can
eliminate Right from player 2's strategy space. That is, if player
1 knows that player 2 is rational then player 1 can play the game
in Figure 1.1.1 as if it were the game in Figure 1.1.2.


[Figure 1.1.2. The game of Figure 1.1.1 with Right eliminated: Player 1 chooses Up or Down (rows); Player 2 chooses Left or Middle (columns).]


In Figure 1.1.2, Down is now strictly dominated by Up for
player 1, so if player 1 is rational (and player 1 knows that player 2
is rational, so that the game in Figure 1.1.2 applies) then player 1
will not play Down. Thus, if player 2 knows that player 1 is
rational, and player 2 knows that player 1 knows that player 2 is
rational (so that player 2 knows that Figure 1.1.2 applies), then
player 2 can eliminate Down from player 1's strategy space,
leaving the game in Figure 1.1.3. But now Left is strictly dominated
by Middle for player 2, leaving (Up, Middle) as the outcome of
the game.

[Figure 1.1.3. The game of Figure 1.1.2 with Down eliminated: Player 1 has only Up (row); Player 2 chooses Left or Middle (columns).]

This process is called iterated elimination of strictly dominated
strategies. Although it is based on the appealing idea that
rational players do not play strictly dominated strategies, the process
has two drawbacks. First, each step requires a further assumption
about what the players know about each other's rationality. If
we want to be able to apply the process for an arbitrary number
of steps, we need to assume that it is common knowledge that the
players are rational. That is, we need to assume not only that all
the players are rational, but also that all the players know that all
the players are rational, and that all the players know that all the
players know that all the players are rational, and so on, ad
infinitum. (See Aumann [1976] for the formal definition of common
knowledge.)
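
The elimination steps described for Figure 1.1.1 can be sketched in a few lines. The payoff table below is an assumption: it is chosen to agree with every inequality quoted in the text (for example, 1 > 0 under Left, 2 > 0 under Right, and 2 > 1 and 1 > 0 for Middle against Right), but entries the text does not state are illustrative. The function name `iesds` is hypothetical.

```python
# Assumed payoffs (player 1, player 2) for a game shaped like Figure 1.1.1.
PAYOFF = {
    ("Up", "Left"): (1, 0), ("Up", "Middle"): (1, 2), ("Up", "Right"): (0, 1),
    ("Down", "Left"): (0, 3), ("Down", "Middle"): (0, 1), ("Down", "Right"): (2, 0),
}


def iesds(rows, cols, payoff):
    """Iterated elimination of strictly dominated pure strategies (two players)."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        # Remove any row strictly dominated by another surviving row.
        for r in list(rows):
            if any(all(payoff[(r, c)][0] < payoff[(r2, c)][0] for c in cols)
                   for r2 in rows if r2 != r):
                rows.remove(r)
                changed = True
        # Remove any column strictly dominated by another surviving column.
        for c in list(cols):
            if any(all(payoff[(r, c)][1] < payoff[(r, c2)][1] for r in rows)
                   for c2 in cols if c2 != c):
                cols.remove(c)
                changed = True
    return rows, cols


survivors = iesds(["Up", "Down"], ["Left", "Middle", "Right"], PAYOFF)
print(survivors)  # (['Up'], ['Middle'])
```

Each pass of the loop corresponds to one further layer of "player 1 knows that player 2 knows ..." in the argument above, which is why applying the process for an arbitrary number of steps requires common knowledge of rationality.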

The second drawback of iterated elimination of strictly
dominated strategies is that the process often produces a very
imprecise prediction about the play of the game. Consider the game in
Figure 1.1.4, for example. In this game there are no strictly
dominated strategies to be eliminated. (Since we have not motivated
this game in the slightest, it may appear arbitrary, or even
pathological. See the case of three or more firms in the Cournot model
in Section 1.2.A for an economic application in the same spirit.)
Since all the strategies in the game survive iterated elimination of
strictly dominated strategies, the process produces no prediction
whatsoever about the play of the game.





Figure 1.1.4. 


We turn next to Nash equilibrium, a solution concept that
produces much tighter predictions in a very broad class of games.
We show that Nash equilibrium is a stronger solution concept
than iterated elimination of strictly dominated strategies, in the
sense that the players' strategies in a Nash equilibrium always
survive iterated elimination of strictly dominated strategies, but
the converse is not true. In subsequent chapters we will argue that
in richer games even Nash equilibrium produces too imprecise a
prediction about the play of the game, so we will define
still-stronger notions of equilibrium that are better suited for these
richer games.


1.1.C Motivation and Definition of Nash Equilibrium 


One way to motivate the definition of Nash equilibrium is to argue 
that if game theory is to provide a unique solution to a game- 
theoretic problem then the solution must be a Nash equilibrium, 
in the following sense. Suppose that game theory makes a unique 
prediction about the strategy each player will choose. In order 
for this prediction to be correct, it is necessary that each player be 
willing to choose the strategy predicted by the theory. Thus, each 
player's predicted strategy must be that player's best response 
to the predicted strategies of the other players. Such a prediction 
could be called strategically stable or self-enforcing, because no single
player wants to deviate from his or her predicted strategy. We will 
call such a prediction a Nash equilibrium: 


Definition In the $n$-player normal-form game $G = \{S_1, \ldots, S_n; u_1, \ldots,
u_n\}$, the strategies $(s_1^*, \ldots, s_n^*)$ are a Nash equilibrium if, for each player
$i$, $s_i^*$ is (at least tied for) player $i$'s best response to the strategies specified
for the $n - 1$ other players, $(s_1^*, \ldots, s_{i-1}^*, s_{i+1}^*, \ldots, s_n^*)$:

$$u_i(s_1^*, \ldots, s_{i-1}^*, s_i^*, s_{i+1}^*, \ldots, s_n^*) \geq u_i(s_1^*, \ldots, s_{i-1}^*, s_i, s_{i+1}^*, \ldots, s_n^*) \tag{NE}$$

for every feasible strategy $s_i$ in $S_i$; that is, $s_i^*$ solves

$$\max_{s_i \in S_i} u_i(s_1^*, \ldots, s_{i-1}^*, s_i, s_{i+1}^*, \ldots, s_n^*).$$


To relate this definition to its motivation, suppose game theory
offers the strategies $(s_1', \ldots, s_n')$ as the solution to the normal-form
game $G = \{S_1, \ldots, S_n; u_1, \ldots, u_n\}$. Saying that $(s_1', \ldots, s_n')$ is not
a Nash equilibrium of $G$ is equivalent to saying that there exists
some player $i$ such that $s_i'$ is not a best response to
$(s_1', \ldots, s_{i-1}', s_{i+1}', \ldots, s_n')$. That is, there exists some $s_i''$ in $S_i$ such that

$$u_i(s_1', \ldots, s_{i-1}', s_i', s_{i+1}', \ldots, s_n') < u_i(s_1', \ldots, s_{i-1}', s_i'', s_{i+1}', \ldots, s_n').$$

Thus, if the theory offers the strategies $(s_1', \ldots, s_n')$ as the solution
but these strategies are not a Nash equilibrium, then at least one 
player will have an incentive to deviate from the theory's predic- 
tion, so the theory will be falsified by the actual play of the game. 
A closely related motivation for Nash equilibrium involves the 
idea of convention: if a convention is to develop about how to 
play a given game then the strategies prescribed by the conven- 
tion must be a Nash equilibrium, else at least one player will not 
abide by the convention. 

To be more concrete, we now solve a few examples. Consider 
the three normal-form games already described—the Prisoners' 
Dilemma and Figures 1.1.1 and 1.1.4. A brute-force approach to 
finding a game's Nash equilibria is simply to check whether each 
possible combination of strategies satisfies condition (NE) in the 
definition.³ In a two-player game, this approach begins as follows:
for each player, and for each feasible strategy for that player,
determine the other player's best response to that strategy. Figure 1.1.5
does this for the game in Figure 1.1.4 by underlining the payoff
to player $j$'s best response to each of player $i$'s feasible strategies.
If the column player were to play L, for instance, then the row 
player's best response would be M, since 4 exceeds 3 and 0, so 
the row player's payoff of 4 in the (M, L) cell of the bi-matrix is 
underlined. 
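
The underlining procedure can be sketched as follows. The payoff table is an assumption standing in for Figure 1.1.4: it reproduces the one column the text spells out (against L the row player gets 0, 4, and 3) and has no strictly dominated strategies, but the remaining entries and the labels T and C are illustrative. The helper names are hypothetical.

```python
ROWS, COLS = ["T", "M", "B"], ["L", "C", "R"]

# Assumed payoffs (row player, column player) for a game like Figure 1.1.4.
PAYOFF = {
    ("T", "L"): (0, 4), ("T", "C"): (4, 0), ("T", "R"): (5, 3),
    ("M", "L"): (4, 0), ("M", "C"): (0, 4), ("M", "R"): (5, 3),
    ("B", "L"): (3, 5), ("B", "C"): (3, 5), ("B", "R"): (6, 6),
}


def row_best_responses(c):
    """Rows whose payoff would be underlined in column c."""
    best = max(PAYOFF[(r, c)][0] for r in ROWS)
    return {r for r in ROWS if PAYOFF[(r, c)][0] == best}


def col_best_responses(r):
    """Columns whose payoff would be underlined in row r."""
    best = max(PAYOFF[(r, c)][1] for c in COLS)
    return {c for c in COLS if PAYOFF[(r, c)][1] == best}


# A cell satisfies (NE) when both of its payoffs are underlined.
nash_equilibria = [
    (r, c) for r in ROWS for c in COLS
    if r in row_best_responses(c) and c in col_best_responses(r)
]
print(nash_equilibria)  # [('B', 'R')]
```

Returning sets of best responses, rather than a single strategy, keeps ties intact, which matters because (NE) only requires each strategy to be at least tied for best.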

A pair of strategies satisfies condition (NE) if each player's 
strategy is a best response to the other's—that is, if both pay- 
offs are underlined in the corresponding cell of the bi-matrix. 
Thus, (B, R) is the only strategy pair that satisfies (NE); likewise 
for (Fink, Fink) in the Prisoners’ Dilemma and (Up, Middle) in 


³ In Section 1.3.A we will distinguish between pure and mixed strategies. We
will then see that the definition given here describes pure-strategy Nash equilibria, 
but that there can also be mixed-strategy Nash equilibria. Unless explicitly noted 
otherwise, all references to Nash equilibria in this section are to pure-strategy 
Nash equilibria. 
Figure 1.1.5.


Figure 1.1.1. These strategy pairs are the unique Nash equilibria 
of these games.⁴

We next address the relation between Nash equilibrium and
iterated elimination of strictly dominated strategies. Recall that
the Nash equilibrium strategies in the Prisoners' Dilemma and
Figure 1.1.1, namely (Fink, Fink) and (Up, Middle), respectively, are the
only strategies that survive iterated elimination of strictly
dominated strategies. This result can be generalized: if iterated
elimination of strictly dominated strategies eliminates all but the strategies
$(s_1^*, \ldots, s_n^*)$, then these strategies are the unique Nash equilibrium of
the game. (See Appendix 1.1.C for a proof of this claim.) Since
iterated elimination of strictly dominated strategies frequently does
not eliminate all but a single combination of strategies, however,
it is of more interest that Nash equilibrium is a stronger solution
concept than iterated elimination of strictly dominated strategies,
in the following sense. If the strategies $(s_1^*, \ldots, s_n^*)$ are a Nash
equilibrium then they survive iterated elimination of strictly
dominated strategies (again, see the Appendix for a proof), but there
can be strategies that survive iterated elimination of strictly
dominated strategies but are not part of any Nash equilibrium. To see
the latter, recall that in Figure 1.1.4 Nash equilibrium gives the
unique prediction (B, R), whereas iterated elimination of strictly
dominated strategies gives the maximally imprecise prediction: no
strategies are eliminated; anything could happen.

Having shown that Nash equilibrium is a stronger solution 
concept than iterated elimination of strictly dominated strategies, 
we must now ask whether Nash equilibrium is too strong a so- 
lution concept. That is, can we be sure that a Nash equilibrium 



⁴ This statement is correct even if we do not restrict attention to pure-strategy
Nash equilibrium, because no mixed-strategy Nash equilibria exist in these three
games. See Problem 1.10.
exists? Nash (1950) showed that in any finite game (i.e., a game in
which the number of players $n$ and the strategy sets $S_1, \ldots, S_n$ are
all finite) there exists at least one Nash equilibrium. (This equi- 
librium may involve mixed strategies, which we will discuss in 
Section 1.3.A; see Section 1.3.B for a precise statement of Nash's 
Theorem.) Cournot (1838) proposed the same notion of equilib- 
rium in the context of a particular model of duopoly and demon- 
strated (by construction) that an equilibrium exists in that model; 
see Section 1.2.A. In every application analyzed in this book, we 
will follow Cournot's lead: we will demonstrate that a Nash (or 
stronger) equilibrium exists by constructing one. In some of the 
theoretical sections, however, we will rely on Nash's Theorem (or 
its analog for stronger equilibrium concepts) and simply assert 
that an equilibrium exists. 

We conclude this section with another classic example—The 
Battle of the Sexes. This example shows that a game can have mul- 
tiple Nash equilibria, and also will be useful in the discussions of 
mixed strategies in Sections 1.3.B and 3.2.A. In the traditional ex- 
position of the game (which, it will.be clear, dates from the 1950s), 
a man and a woman are trying to decide on an evening’s enter- 
tainment; we analyze a gender-neutral version of the game. While 
at separate workplaces, Pat and Chris must choose to attend either 
the opera or a prize fight. Both players would rather spend the 
evening together than apart, but Pat would rather they be together 
at the prize fight while Chris would rather they be together at the 
opera, as represented in the accompanying bi-matrix. 


[Bi-matrix: Chris chooses Opera or Fight (rows); Pat chooses Opera or Fight (columns).]


The Battle of the Sexes


Both (Opera, Opera) and (Fight, Fight) are Nash equilibria. 
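
A brute-force check of condition (NE) shows both equilibria. The numerical payoffs below are an assumption: the text gives only the ordering (both prefer being together; Pat prefers the fight, Chris the opera), so the values 2, 1, and 0 are illustrative, as are the names `PAYOFF` and `is_nash`.

```python
from itertools import product

ACTIONS = ["Opera", "Fight"]

# Assumed payoffs (Chris, Pat); Chris is the row player.
PAYOFF = {
    ("Opera", "Opera"): (2, 1),  # together at Chris's preferred event
    ("Opera", "Fight"): (0, 0),  # apart
    ("Fight", "Opera"): (0, 0),  # apart
    ("Fight", "Fight"): (1, 2),  # together at Pat's preferred event
}


def is_nash(chris: str, pat: str) -> bool:
    """Condition (NE): no profitable unilateral deviation for either player."""
    u_chris, u_pat = PAYOFF[(chris, pat)]
    chris_ok = all(PAYOFF[(a, pat)][0] <= u_chris for a in ACTIONS)
    pat_ok = all(PAYOFF[(chris, a)][1] <= u_pat for a in ACTIONS)
    return chris_ok and pat_ok


equilibria = [p for p in product(ACTIONS, ACTIONS) if is_nash(*p)]
print(equilibria)  # [('Opera', 'Opera'), ('Fight', 'Fight')]
```

Any payoffs with the same ordering would give the same two pure-strategy equilibria; only the mixed-strategy equilibrium discussed later depends on the particular numbers.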

We argued above that if game theory is to provide a unique 
solution to a game then the solution must be a Nash equilibrium. 
This argument ignores the possibility of games in which game 
theory does not provide a unique solution. We also argued that 
if a convention is to develop about how to play a given game,
then the strategies prescribed by the convention must be a Nash 
equilibrium, but this argument similarly ignores the possibility of 
games for which a convention will not develop. In some games 
with multiple Nash equilibria one equilibrium stands out as the 
compelling solution to the game. (Much of the theory in later 
chapters is an effort to identify such a compelling equilibrium 
in different classes of games.) Thus, the existence of multiple 
Nash equilibria is not a problem in and of itself. In the Battle
of the Sexes, however, (Opera, Opera) and (Fight, Fight) seem 
equally compelling, which suggests that there may be games for 
which game theory does not provide a unique solution and no 
convention will develop.⁵ In such games, Nash equilibrium loses
much of its appeal as a prediction of play. 


Appendix 1.1.C 


This appendix contains proofs of the following two Propositions, 
which were stated informally in Section 1.1.C. Skipping these 
proofs will not substantially hamper one's understanding of later 
material. For readers not accustomed to manipulating formal def- 
initions and constructing proofs, however, mastering these proofs 
will be a valuable exercise.


Proposition A In the $n$-player normal-form game $G = \{S_1, \ldots, S_n;
u_1, \ldots, u_n\}$, if iterated elimination of strictly dominated strategies
eliminates all but the strategies $(s_1^*, \ldots, s_n^*)$, then these strategies are the unique
Nash equilibrium of the game.


Proposition B In the $n$-player normal-form game $G = \{S_1, \ldots, S_n;
u_1, \ldots, u_n\}$, if the strategies $(s_1^*, \ldots, s_n^*)$ are a Nash equilibrium, then they
survive iterated elimination of strictly dominated strategies.


⁵ In Section 1.3.B we describe a third Nash equilibrium of the Battle of the
Sexes (involving mixed strategies). Unlike (Opera, Opera) and (Fight, Fight), this
third equilibrium has symmetric payoffs, as one might expect from the unique
solution to a symmetric game; on the other hand, the third equilibrium is also
inefficient, which may work against its development as a convention. Whatever
one's judgment about the Nash equilibria in the Battle of the Sexes, however,
the broader point remains: there may be games in which game theory does not
provide a unique solution and no convention will develop.
Since Proposition B is simpler to prove, we begin with it, to
warm up. The argument is by contradiction. That is, we will as- 
sume that one of the strategies in a Nash equilibrium is eliminated 
by iterated elimination of strictly dominated strategies, and then 
we will show that a contradiction would result if this assumption 
were true, thereby proving that the assumption must be false. 

Suppose that the strategies $(s_1^*, \ldots, s_n^*)$ are a Nash equilibrium
of the normal-form game $G = \{S_1, \ldots, S_n; u_1, \ldots, u_n\}$, but suppose
also that (perhaps after some strategies other than $(s_1^*, \ldots, s_n^*)$ have
been eliminated) $s_i^*$ is the first of the strategies $(s_1^*, \ldots, s_n^*)$ to be
eliminated for being strictly dominated. Then there must exist a
strategy $s_i''$ that has not yet been eliminated from $S_i$ that strictly
dominates $s_i^*$. Adapting (DS), we have

$$u_i(s_1, \ldots, s_{i-1}, s_i^*, s_{i+1}, \ldots, s_n) < u_i(s_1, \ldots, s_{i-1}, s_i'', s_{i+1}, \ldots, s_n) \tag{1.1.1}$$

for each $(s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_n)$ that can be constructed from the
strategies that have not yet been eliminated from the other players'
strategy spaces. Since $s_i^*$ is the first of the equilibrium strategies to
be eliminated, the other players' equilibrium strategies have not
yet been eliminated, so one of the implications of (1.1.1) is

$$u_i(s_1^*, \ldots, s_{i-1}^*, s_i^*, s_{i+1}^*, \ldots, s_n^*) < u_i(s_1^*, \ldots, s_{i-1}^*, s_i'', s_{i+1}^*, \ldots, s_n^*). \tag{1.1.2}$$

But (1.1.2) is contradicted by (NE): $s_i^*$ must be a best response to
$(s_1^*, \ldots, s_{i-1}^*, s_{i+1}^*, \ldots, s_n^*)$, so there cannot exist a strategy $s_i''$ that
strictly dominates $s_i^*$. This contradiction completes the proof.
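
The two propositions can also be spot-checked numerically. The sketch below draws random two-player games with integer payoffs and counts violations of each proposition; it is a sanity check under stated assumptions (pure strategies only, 3-by-3 games, hypothetical function names), not a substitute for the proofs.

```python
import random
from itertools import product


def pure_nash(rows, cols, u):
    """All pure-strategy profiles satisfying (NE) in a two-player game."""
    return [
        (r, c) for r, c in product(rows, cols)
        if all(u[(r2, c)][0] <= u[(r, c)][0] for r2 in rows)
        and all(u[(r, c2)][1] <= u[(r, c)][1] for c2 in cols)
    ]


def iesds(rows, cols, u):
    """Iterated elimination of strictly dominated pure strategies."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in list(rows):
            if any(all(u[(r, c)][0] < u[(r2, c)][0] for c in cols)
                   for r2 in rows if r2 != r):
                rows.remove(r)
                changed = True
        for c in list(cols):
            if any(all(u[(r, c)][1] < u[(r, c2)][1] for r in rows)
                   for c2 in cols if c2 != c):
                cols.remove(c)
                changed = True
    return rows, cols


rng = random.Random(0)  # fixed seed so the run is repeatable
violations_a = violations_b = 0
for _ in range(500):
    rows, cols = [0, 1, 2], [0, 1, 2]
    u = {(r, c): (rng.randint(0, 9), rng.randint(0, 9))
         for r in rows for c in cols}
    surv_rows, surv_cols = iesds(rows, cols, u)
    equilibria = pure_nash(rows, cols, u)
    # Proposition B: every Nash equilibrium strategy survives elimination.
    if any(r not in surv_rows or c not in surv_cols for r, c in equilibria):
        violations_b += 1
    # Proposition A: a unique survivor is the unique Nash equilibrium.
    if len(surv_rows) == 1 and len(surv_cols) == 1:
        if equilibria != [(surv_rows[0], surv_cols[0])]:
            violations_a += 1

print(violations_a, violations_b)  # 0 0
```

Because both propositions hold for every finite game, the counters stay at zero whatever seed or game size is used.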

Having proved Proposition B, we have already proved part of 
Proposition A: all we need to show is that if iterated elimination 
of dominated strategies eliminates all but the strategies $(s_1^*, \ldots, s_n^*)$
then these strategies are a Nash equilibrium; by Proposition B, any 
other Nash equilibria would also have survived, so this equilib- 
rium must be unique. We assume that G is finite. 

The argument is again by contradiction. Suppose that iterated
elimination of dominated strategies eliminates all but the strategies
$(s_1^*, \ldots, s_n^*)$ but these strategies are not a Nash equilibrium. Then
there must exist some player $i$ and some feasible strategy $s_i$ in $S_i$
such that (NE) fails, but $s_i$ must have been strictly dominated by
some other strategy $s_i'$ at some stage of the process. The formal
statements of these two observations are: there exists $s_i$ in $S_i$ such
that

$$u_i(s_1^*, \ldots, s_{i-1}^*, s_i^*, s_{i+1}^*, \ldots, s_n^*) < u_i(s_1^*, \ldots, s_{i-1}^*, s_i, s_{i+1}^*, \ldots, s_n^*); \tag{1.1.3}$$

and there exists $s_i'$ in the set of player $i$'s strategies remaining at
some stage of the process such that

$$u_i(s_1, \ldots, s_{i-1}, s_i, s_{i+1}, \ldots, s_n) < u_i(s_1, \ldots, s_{i-1}, s_i', s_{i+1}, \ldots, s_n) \tag{1.1.4}$$

for each $(s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_n)$ that can be constructed from the
strategies remaining in the other players' strategy spaces at that
stage of the process. Since the other players' strategies
$(s_1^*, \ldots, s_{i-1}^*, s_{i+1}^*, \ldots, s_n^*)$ are never eliminated, one of the implications of (1.1.4)
is

$$u_i(s_1^*, \ldots, s_{i-1}^*, s_i, s_{i+1}^*, \ldots, s_n^*) < u_i(s_1^*, \ldots, s_{i-1}^*, s_i', s_{i+1}^*, \ldots, s_n^*). \tag{1.1.5}$$

If $s_i' = s_i^*$ (that is, if $s_i^*$ is the strategy that strictly dominates $s_i$) then
(1.1.5) contradicts (1.1.3), in which case the proof is complete. If
$s_i' \neq s_i^*$ then some other strategy $s_i''$ must later strictly dominate $s_i'$,
since $s_i'$ does not survive the process. Thus, inequalities analogous
to (1.1.4) and (1.1.5) hold with $s_i'$ and $s_i''$ replacing $s_i$ and $s_i'$,
respectively. Once again, if $s_i'' = s_i^*$ then the proof is complete; otherwise,
two more analogous inequalities can be constructed. Since $s_i^*$ is
the only strategy from $S_i$ to survive the process, repeating this
argument (in a finite game) eventually completes the proof.


1.2 Applications 

1.2.A Cournot Model of Duopoly


As noted in the previous section, Cournot (1838) anticipated Nash's 
definition of equilibrium by over a century (but only in the con- 
text of a particular model of duopoly). Not surprisingly, Cournot's 
work is one of the classics of game theory; it is also one of the
cornerstones of the theory of industrial organization. We consider a
very simple version of Cournot's model here, and return to
variations on the model in each subsequent chapter. In this section
we use the model to illustrate: (a) the translation of an informal 
statement of a problem into a normal-form representation of a 
game; (b) the computations involved in solving for the game's 
Nash equilibrium; and (c) iterated elimination of strictly domi- 
nated strategies. 

Let $q_1$ and $q_2$ denote the quantities (of a homogeneous product)
produced by firms 1 and 2, respectively. Let $P(Q) = a - Q$ be the
market-clearing price when the aggregate quantity on the market
is $Q = q_1 + q_2$. (More precisely, $P(Q) = a - Q$ for $Q < a$, and
$P(Q) = 0$ for $Q \geq a$.) Assume that the total cost to firm $i$ of
producing quantity $q_i$ is $C_i(q_i) = cq_i$. That is, there are no fixed
costs and the marginal cost is constant at $c$, where we assume
$c < a$. Following Cournot, suppose that the firms choose their
quantities simultaneously.⁶

In order to find the Nash equilibrium of the Cournot game, 
we first translate the problem into a normal-form game. Recall 
from the previous section that the normal-form representation of 
a game specifies: (1) the players in the game, (2) the strategies 
available to each player, and (3) the payoff received by each player 
for each combination of strategies that could be chosen by the 
players. There are of course two players in any duopoly game— 
the two firms. In the Cournot model, the strategies available to 
each firm are the different quantities it might produce. We will 
assume that output is continuously divisible. Naturally, negative 
outputs are not feasible. Thus, each firm's strategy space can be
represented as $S_i = [0, \infty)$, the nonnegative real numbers, in which
case a typical strategy $s_i$ is a quantity choice, $q_i \geq 0$. One could
argue that extremely large quantities are not feasible and so should
not be included in a firm's strategy space. Because $P(Q) = 0$ for
$Q \geq a$, however, neither firm will produce a quantity $q_i > a$.

⁶We discuss Bertrand's (1883) model, in which firms choose prices rather than
quantities, in Section 1.2.B, and Stackelberg's (1934) model, in which firms choose
quantities but one firm chooses before (and is observed by) the other, in
Section 2.1.B. Finally, we discuss Friedman's (1971) model, in which the interaction
described in Cournot's model occurs repeatedly over time, in Section 2.3.C.

It remains to specify the payoff to firm $i$ as a function of the
strategies chosen by it and by the other firm, and to define and
solve for equilibrium. We assume that the firm's payoff is simply
its profit. Thus, the payoff $u_i(s_i, s_j)$ in a general two-player game
in normal form can be written here as⁷

$$\pi_i(q_i, q_j) = q_i[P(q_i + q_j) - c] = q_i[a - (q_i + q_j) - c].$$


Recall from the previous section that in a two-player game in
normal form, the strategy pair $(s_1^*, s_2^*)$ is a Nash equilibrium if, for
each player $i$,

$$u_i(s_i^*, s_j^*) \ge u_i(s_i, s_j^*) \qquad \text{(NE)}$$

for every feasible strategy $s_i$ in $S_i$. Equivalently, for each player $i$,
$s_i^*$ must solve the optimization problem

$$\max_{s_i \in S_i} u_i(s_i, s_j^*).$$


In the Cournot duopoly model, the analogous statement is that
the quantity pair $(q_1^*, q_2^*)$ is a Nash equilibrium if, for each firm $i$,
$q_i^*$ solves

$$\max_{0 \le q_i < \infty} \pi_i(q_i, q_j^*) = \max_{0 \le q_i < \infty} q_i[a - (q_i + q_j^*) - c].$$


Assuming $q_j^* < a - c$ (as will be shown to be true), the first-order
condition for firm $i$'s optimization problem is both necessary and
sufficient; it yields

$$q_i = \frac{1}{2}(a - q_j^* - c). \qquad (1.2.1)$$

Thus, if the quantity pair $(q_1^*, q_2^*)$ is to be a Nash equilibrium, the
firms' quantity choices must satisfy

$$q_1^* = \frac{1}{2}(a - q_2^* - c)$$

and

$$q_2^* = \frac{1}{2}(a - q_1^* - c).$$

⁷Note that we have changed the notation slightly by writing $u_i(s_i, s_j)$ rather
than $u_i(s_1, s_2)$. Both expressions represent the payoff to player $i$ as a function of
the strategies chosen by all the players. We will use these expressions (and their
$n$-player analogs) interchangeably.

Solving this pair of equations yields

$$q_1^* = q_2^* = \frac{a - c}{3},$$

which is indeed less than $a - c$, as assumed.
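The algebra above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it iterates the best-response rule (1.2.1) for both firms from an arbitrary starting point and compares the limit with $(a - c)/3$. The parameter values `a = 10` and `c = 1`, the function names, and the use of simultaneous best-response iteration are all illustrative assumptions.

```python
def best_response(q_other, a, c):
    """Best response from (1.2.1): (a - q_other - c) / 2, floored at zero."""
    return max(0.0, (a - q_other - c) / 2.0)


def cournot_equilibrium(a, c, tol=1e-12, max_iter=1000):
    """Iterate simultaneous best responses from (0, 0) until they stop changing."""
    q1, q2 = 0.0, 0.0
    for _ in range(max_iter):
        new_q1 = best_response(q2, a, c)
        new_q2 = best_response(q1, a, c)
        if abs(new_q1 - q1) < tol and abs(new_q2 - q2) < tol:
            return new_q1, new_q2
        q1, q2 = new_q1, new_q2
    return q1, q2


# Hypothetical parameter values, chosen only for illustration (c < a).
a, c = 10.0, 1.0
q1_star, q2_star = cournot_equilibrium(a, c)
print(q1_star, q2_star, (a - c) / 3.0)
```

Each round of this iteration halves the distance to the fixed point, which is why it settles quickly; that convergence is a property of this linear example rather than of best-response iteration in general.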

The intuition behind this equilibrium is simple. Each firm 
would of course like to be a monopolist in this market, in which 
case it would choose $q_i$ to maximize $\pi_i(q_i, 0)$; it would produce
the monopoly quantity $q_m = (a - c)/2$ and earn the monopoly
profit $\pi_i(q_m, 0) = (a - c)^2/4$. Given that there are two firms, aggregate
profits for the duopoly would be maximized by setting the
aggregate quantity $q_1 + q_2$ equal to the monopoly quantity $q_m$, as
would occur if $q_i = q_m/2$ for each $i$, for example. The problem
with this arrangement is that each firm has an incentive to devi- 
ate: because the monopoly quantity is low, the associated price 
$P(q_m)$ is high, and at this price each firm would like to increase its
quantity, in spite of the fact that such an increase in production
drives down the market-clearing price. (To see this formally, use
(1.2.1) to check that $q_m/2$ is not firm 2's best response to the choice
of $q_m/2$ by firm 1.) In the Cournot equilibrium, in contrast, the aggregate
quantity is higher, so the associated price is lower, so the
temptation to increase output is reduced—reduced by just enough 
that each firm is just deterred from increasing its output by the 
realization that the market-clearing price will fall. See Problem 1.4 
for an analysis of how the presence of n oligopolists affects this 
equilibrium trade-off between the temptation to increase output 
and the reluctance to reduce the market-clearing price. 
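The formal check suggested in parentheses above can be written out in one line. Substituting firm 1's half of the monopoly quantity, $q_1 = q_m/2 = (a - c)/4$, into the best-response rule (1.2.1) gives a strictly larger quantity for firm 2, so splitting the monopoly output is not self-enforcing:

```latex
R_2\!\left(\frac{q_m}{2}\right)
  = \frac{1}{2}\left(a - \frac{a - c}{4} - c\right)
  = \frac{3(a - c)}{8}
  > \frac{a - c}{4}
  = \frac{q_m}{2}.
```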

Rather than solving for the Nash equilibrium in the Cournot 
game algebraically, one could instead proceed graphically, as follows.
Equation (1.2.1) gives firm $i$'s best response to firm $j$'s
equilibrium strategy, $q_j^*$. Analogous reasoning leads to firm 2's
best response to an arbitrary strategy by firm 1 and firm 1's best
response to an arbitrary strategy by firm 2. Assuming that firm 1's
strategy satisfies $q_1 < a - c$, firm 2's best response is

$$R_2(q_1) = \frac{1}{2}(a - q_1 - c);$$

likewise, if $q_2 < a - c$ then firm 1's best response is

$$R_1(q_2) = \frac{1}{2}(a - q_2 - c).$$

[Figure: the best-response functions $R_1(q_2)$ and $R_2(q_1)$, with $q_1$ on the horizontal axis and $q_2$ on the vertical axis. Labeled points: $(0, a - c)$, $(0, (a - c)/2)$, $((a - c)/2, 0)$, $(a - c, 0)$, and the intersection $(q_1^*, q_2^*)$.]

Figure 1.2.1.


As shown in Figure 1.2.1, these two best-response functions intersect
only once, at the equilibrium quantity pair $(q_1^*, q_2^*)$.

A third way to solve for this Nash equilibrium is to apply 
the process of iterated elimination of strictly dominated strategies. 
This process yields a unique solution—which, by Proposition A 
in Appendix 1.1.C, must be the Nash equilibrium $(q_1^*, q_2^*)$. The
complete process requires an infinite number of steps, each of
which eliminates a fraction of the quantities remaining in each
firm's strategy space; we discuss only the first two steps. First, the
monopoly quantity $q_m = (a - c)/2$ strictly dominates any higher
quantity. That is, for any $x > 0$, $\pi_i(q_m, q_j) > \pi_i(q_m + x, q_j)$ for all
$q_j \ge 0$. To see this, note that if $Q = q_m + x + q_j < a$, then

$$\pi_i(q_m, q_j) = \frac{a - c}{2}\left[\frac{a - c}{2} - q_j\right]$$

and

$$\pi_i(q_m + x, q_j) = \left[\frac{a - c}{2} + x\right]\left[\frac{a - c}{2} - x - q_j\right] = \pi_i(q_m, q_j) - x(x + q_j),$$

and if $Q = q_m + x + q_j \ge a$, then $P(Q) = 0$, so producing a smaller
quantity raises profit. Second, given that quantities exceeding $q_m$
have been eliminated, the quantity $(a - c)/4$ strictly dominates
any lower quantity. That is, for any $x$ between zero and $(a - c)/4$,
$\pi_i[(a - c)/4, q_j] > \pi_i[(a - c)/4 - x, q_j]$ for all $q_j$ between zero and
$(a - c)/2$. To see this, note that

$$\pi_i\left(\frac{a - c}{4}, q_j\right) = \frac{a - c}{4}\left[\frac{3(a - c)}{4} - q_j\right]$$

and

$$\pi_i\left(\frac{a - c}{4} - x, q_j\right) = \left[\frac{a - c}{4} - x\right]\left[\frac{3(a - c)}{4} + x - q_j\right] = \pi_i\left(\frac{a - c}{4}, q_j\right) - x\left[\frac{a - c}{2} + x - q_j\right].$$


After these two steps, the quantities remaining in each firm's 
strategy space are those in the interval between (a — c)/4 and 
(a — c)/2. Repeating these arguments leads to ever-smaller inter- 
vals of remaining quantities. In the limit, these intervals converge 
to the single point $q_i^* = (a - c)/3$.
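The shrinking intervals can be traced numerically. The sketch below is an illustration under stated assumptions, not the text's own procedure: it assumes that each later round repeats the logic of the two steps above, so that an odd-numbered round lowers the upper bound to the best response to the current lower bound, and an even-numbered round raises the lower bound to the best response to the current upper bound. The values `a = 10` and `c = 1` and the function name are hypothetical.

```python
def surviving_intervals(a, c, steps):
    """Track the interval of quantities that survive each elimination round."""
    lo, hi = 0.0, float("inf")
    history = []
    for step in range(steps):
        if step % 2 == 0:
            # Quantities above the best response to the lowest rival quantity go.
            hi = max(0.0, (a - lo - c) / 2.0)
        else:
            # Quantities below the best response to the highest rival quantity go.
            lo = max(0.0, (a - hi - c) / 2.0)
        history.append((lo, hi))
    return history


# Hypothetical parameter values, chosen only for illustration (c < a).
a, c = 10.0, 1.0
history = surviving_intervals(a, c, 60)
for lo, hi in history[:4]:
    print(lo, hi)
print(history[-1], (a - c) / 3.0)
```

The first two printed intervals correspond to the two steps discussed in the text: $[0, (a - c)/2]$ and then $[(a - c)/4, (a - c)/2]$.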

Iterated elimination of strictly dominated strategies can also be
described graphically, by using the observation (from footnote 1;
see also the discussion in Section 1.3.A) that a strategy is strictly
dominated if and only if there is no belief about the other players'
choices for which the strategy is a best response. Since there are
only two firms in this model, we can restate this observation as:
a quantity $q_i$ is strictly dominated if and only if there is no belief
about $q_j$ such that $q_i$ is firm $i$'s best response. We again discuss only
the first two steps of the iterative process. First, it is never a best
response for firm $i$ to produce more than the monopoly quantity,
$q_m = (a - c)/2$. To see this, consider firm 2's best-response function,
for example: in Figure 1.2.1, $R_2(q_1)$ equals $q_m$ when $q_1 = 0$ and
declines as $q_1$ increases. Thus, for any $q_j \ge 0$, if firm $i$ believes
that firm $j$ will choose $q_j$, then firm $i$'s best response is less than or
equal to $q_m$; there is no $q_j$ such that firm $i$'s best response exceeds
$q_m$. Second, given this upper bound on firm $j$'s quantity, we can
derive a lower bound on firm $i$'s best response: if $q_j \le (a - c)/2$,
then $R_i(q_j) \ge (a - c)/4$, as shown for firm 2's best response in
Figure 1.2.2.⁸

⁸These two arguments are slightly incomplete because we have not analyzed

[Figure: firm 2's best-response function $R_2(q_1)$, with $q_1$ on the horizontal axis and $q_2$ on the vertical axis. Labeled points: $(0, (a - c)/2)$, $(0, (a - c)/4)$, $((a - c)/2, 0)$, and $(a - c, 0)$.]

Figure 1.2.2.


As before, repeating these arguments leads to the single quantity 
$q_i^* = (a - c)/3$.

We conclude this section by changing the Cournot model so
that iterated elimination of strictly dominated strategies does not 
yield a unique solution. To do this, we simply add one or more 
firms to the existing duopoly. We will see that the first of the 
two steps discussed in the duopoly case continues to hold, but 
that the process ends there. Thus, when there are more than two 
firms, iterated elimination of strictly dominated strategies yields 
only the imprecise prediction that each firm's quantity will not 
exceed the monopoly quantity (much as in Figure 1.1.4, where no 
strategies were eliminated by this process). 

For concreteness, we consider the three-firm case. Let $Q_{-i}$
denote the sum of the quantities chosen by the firms other than
$i$, and let $\pi_i(q_i, Q_{-i}) = q_i(a - q_i - Q_{-i} - c)$ provided $q_i + Q_{-i} < a$
(whereas $\pi_i(q_i, Q_{-i}) = -c q_i$ if $q_i + Q_{-i} \ge a$). It is again true that the
monopoly quantity $q_m = (a - c)/2$ strictly dominates any higher
quantity. That is, for any $x > 0$, $\pi_i(q_m, Q_{-i}) > \pi_i(q_m + x, Q_{-i})$ for
all $Q_{-i} \ge 0$, just as in the first step in the duopoly case. Since
there are two firms other than firm $i$, however, all we can say
about $Q_{-i}$ is that it is between zero and $a - c$, because $q_j$ and $q_k$
are between zero and $(a - c)/2$. But this implies that no quantity
$q_i \ge 0$ is strictly dominated for firm $i$, because for each $q_i$ between
zero and $(a - c)/2$ there exists a value of $Q_{-i}$ between zero and
$a - c$ (namely, $Q_{-i} = a - c - 2q_i$) such that $q_i$ is firm $i$'s best response
to $Q_{-i}$. Thus, no further strategies can be eliminated.

firm $i$'s best response when firm $i$ is uncertain about $q_j$. Suppose firm $i$ is uncertain
about $q_j$ but believes that the expected value of $q_j$ is $E(q_j)$. Because $\pi_i(q_i, q_j)$ is
linear in $q_j$, firm $i$'s best response when it is uncertain in this way simply equals
its best response when it is certain that firm $j$ will choose $E(q_j)$, a case covered
in the text.


1.2.B Bertrand Model of Duopoly


We next consider a different model of how two duopolists might 
interact, based on Bertrand's (1883) suggestion that firms actu- 
ally choose prices, rather than quantities as in Cournot's model. 
It is important to note that Bertrand's model is a different game 
than Cournot's model: the strategy spaces are different, the pay- 
off functions are different, and (as will become clear) the behavior 
in the Nash equilibria of the two models is different. Some au- 
thors summarize these differences by referring to the Cournot and 
Bertrand equilibria. Such usage may be misleading: it refers to the 
difference between the Cournot and Bertrand games, and to the 
difference between the equilibrium behavior in these games, not
to a difference in the equilibrium concept used in the games. In
both games, the equilibrium concept used is the Nash equilibrium defined
in the previous section.

We consider the case of differentiated products. (See Problem 1.7
for the case of homogeneous products.) If firms 1 and 2
choose prices $p_1$ and $p_2$, respectively, the quantity that consumers
demand from firm $i$ is

$$q_i(p_i, p_j) = a - p_i + b p_j,$$

where $b > 0$ reflects the extent to which firm $i$'s product is a
substitute for firm $j$'s product. (This is an unrealistic demand function
because demand for firm $i$'s product is positive even when firm $i$
charges an arbitrarily high price, provided firm $j$ also charges a
high enough price. As will become clear, the problem makes sense
only if $b < 2$.) As in our discussion of the Cournot model, we assume
that there are no fixed costs of production and that marginal
costs are constant at $c$, where $c < a$, and that the firms act (i.e.,
choose their prices) simultaneously.

As before, the first task in the process of finding the Nash equilibrium
is to translate the problem into a normal-form game. There
are again two players. This time, however, the strategies available
to each firm are the different prices it might charge, rather than
the different quantities it might produce. We will assume that
negative prices are not feasible but that any nonnegative price can
be charged; there is no restriction to prices denominated in pennies,
for instance. Thus, each firm's strategy space can again be
represented as $S_i = [0, \infty)$, the nonnegative real numbers, and a
typical strategy $s_i$ is now a price choice, $p_i \ge 0$.

We will again assume that the payoff function for each firm is
just its profit. The profit to firm $i$ when it chooses the price $p_i$ and
its rival chooses the price $p_j$ is

$$\pi_i(p_i, p_j) = q_i(p_i, p_j)[p_i - c] = [a - p_i + b p_j][p_i - c].$$


Thus, the price pair $(p_1^*, p_2^*)$ is a Nash equilibrium if, for each firm $i$,
$p_i^*$ solves

$$\max_{0 \le p_i < \infty} \pi_i(p_i, p_j^*) = \max_{0 \le p_i < \infty} [a - p_i + b p_j^*][p_i - c].$$

The solution to firm $i$'s optimization problem is

$$p_i^* = \frac{1}{2}(a + b p_j^* + c).$$

Therefore, if the price pair $(p_1^*, p_2^*)$ is to be a Nash equilibrium, the
firms' price choices must satisfy

$$p_1^* = \frac{1}{2}(a + b p_2^* + c)$$

and

$$p_2^* = \frac{1}{2}(a + b p_1^* + c).$$

Solving this pair of equations yields

$$p_1^* = p_2^* = \frac{a + c}{2 - b}.$$
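As a numerical check on the Bertrand solution, the sketch below iterates the price best-response rule $p_i = (a + b p_j + c)/2$ and compares the limit with $(a + c)/(2 - b)$. It is an illustration only: the parameter values, the function names, and the choice of best-response iteration are assumptions. The iteration contracts only when $b < 2$, which mirrors the remark in the text that the problem makes sense only in that case.

```python
def bertrand_best_response(p_other, a, b, c):
    """Best-response price to the rival's price: (a + b * p_other + c) / 2."""
    return (a + b * p_other + c) / 2.0


def bertrand_equilibrium(a, b, c, tol=1e-12, max_iter=10000):
    """Iterate simultaneous best responses, starting from pricing at cost."""
    if not 0.0 < b < 2.0:
        raise ValueError("this sketch requires 0 < b < 2")
    p1 = p2 = c
    for _ in range(max_iter):
        new_p1 = bertrand_best_response(p2, a, b, c)
        new_p2 = bertrand_best_response(p1, a, b, c)
        if abs(new_p1 - p1) < tol and abs(new_p2 - p2) < tol:
            return new_p1, new_p2
        p1, p2 = new_p1, new_p2
    return p1, p2


# Hypothetical parameter values, chosen only for illustration.
a, b, c = 10.0, 0.5, 1.0
p1_star, p2_star = bertrand_equilibrium(a, b, c)
print(p1_star, p2_star, (a + c) / (2.0 - b))
```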





1.2.C Final-Offer Arbitration 


Many public-sector workers are forbidden to strike; instead, wage 
disputes are settled by binding arbitration. (Major league baseball
may be a higher-profile example than the public sector but is
substantially less important economically.) Many other disputes,
including medical malpractice cases and claims by shareholders 
against their stockbrokers, also involve arbitration. The two ma- 
jor forms of arbitration are conventional and final-offer arbitration. 
In final-offer arbitration, the two sides make wage offers and then 
the arbitrator picks one of the offers as the settlement. In con- 
ventional arbitration, in contrast, the arbitrator is free to impose 
any wage as the settlement. We now derive the Nash equilibrium
wage offers in a model of final-offer arbitration developed
by Farber (1980). 

Suppose the parties to the dispute are a firm and a union and 
the dispute concerns wages. Let the timing of the game be as 
follows. First, the firm and the union simultaneously make offers, 
denoted by $w_f$ and $w_u$, respectively. Second, the arbitrator chooses
one of the two offers as the settlement. (As in many so-called static
games, this is really a dynamic game of the kind to be discussed
in Chapter 2, but here we reduce it to a static game between the
firm and the union by making assumptions about the arbitrator's
behavior in the second stage.) Assume that the arbitrator has an
ideal settlement she would like to impose, denoted by $x$. Assume
further that, after observing the parties' offers, $w_f$ and $w_u$, the
arbitrator simply chooses the offer that is closer to $x$: provided
that $w_f < w_u$ (as is intuitive, and will be shown to be true), the
arbitrator chooses $w_f$ if $x < (w_f + w_u)/2$ and chooses $w_u$ if
$x > (w_f + w_u)/2$; see Figure 1.2.3. (It will be immaterial what happens
if $x = (w_f + w_u)/2$. Suppose the arbitrator flips a coin.)

The arbitrator knows x but the parties do not. The parties 
believe that x is randomly distributed according to a cumulative 
probability distribution denoted by $F(x)$, with associated probability
density function denoted by $f(x)$.¹⁰ Given our specification
of the arbitrator's behavior, if the offers are $w_f$ and $w_u$,
then the parties believe that the probabilities Prob{$w_f$ chosen} and
Prob{$w_u$ chosen} can be expressed as

$$\text{Prob}\{w_f \text{ chosen}\} = \text{Prob}\left\{x < \frac{w_f + w_u}{2}\right\} = F\left(\frac{w_f + w_u}{2}\right)$$

and

$$\text{Prob}\{w_u \text{ chosen}\} = 1 - F\left(\frac{w_f + w_u}{2}\right).$$

Thus, the expected wage settlement is

$$w_f \cdot \text{Prob}\{w_f \text{ chosen}\} + w_u \cdot \text{Prob}\{w_u \text{ chosen}\} = w_f \cdot F\left(\frac{w_f + w_u}{2}\right) + w_u \cdot \left[1 - F\left(\frac{w_f + w_u}{2}\right)\right].$$

[Figure: the line of possible settlements $x$, divided at $(w_f + w_u)/2$; $w_f$ is chosen for $x$ below this point and $w_u$ is chosen for $x$ above it.]

Figure 1.2.3.

⁹This application involves some basic concepts in probability: a cumulative
probability distribution, a probability density function, and an expected value.
Terse definitions are given as needed; for more detail, consult any introductory
probability text.

¹⁰That is, the probability that $x$ is less than an arbitrary value $x^*$ is denoted
$F(x^*)$, and the derivative of this probability with respect to $x^*$ is denoted $f(x^*)$.
Since $F(x^*)$ is a probability, we have $0 \le F(x^*) \le 1$ for any $x^*$. Furthermore, if
$x^{**} > x^*$ then $F(x^{**}) \ge F(x^*)$, so $f(x^*) \ge 0$ for every $x^*$.


We assume that the firm wants to minimize the expected wage
settlement imposed by the arbitrator and the union wants to maximize it.

If the pair of offers $(w_f^*, w_u^*)$ is to be a Nash equilibrium of the
game between the firm and the union, $w_f^*$ must solve¹¹

$$\min_{w_f} \; w_f \cdot F\left(\frac{w_f + w_u^*}{2}\right) + w_u^* \cdot \left[1 - F\left(\frac{w_f + w_u^*}{2}\right)\right]$$

and $w_u^*$ must solve

$$\max_{w_u} \; w_f^* \cdot F\left(\frac{w_f^* + w_u}{2}\right) + w_u \cdot \left[1 - F\left(\frac{w_f^* + w_u}{2}\right)\right].$$

Thus, the wage-offer pair $(w_f^*, w_u^*)$ must solve the first-order
conditions for these optimization problems,

$$(w_u^* - w_f^*) \cdot \frac{1}{2} f\left(\frac{w_f^* + w_u^*}{2}\right) = F\left(\frac{w_f^* + w_u^*}{2}\right)$$

and

$$(w_u^* - w_f^*) \cdot \frac{1}{2} f\left(\frac{w_f^* + w_u^*}{2}\right) = 1 - F\left(\frac{w_f^* + w_u^*}{2}\right).$$

(We defer considering whether these first-order conditions are
sufficient.) Since the left-hand sides of these first-order conditions are
equal, the right-hand sides must also be equal, which implies that

$$F\left(\frac{w_f^* + w_u^*}{2}\right) = \frac{1}{2}; \qquad (1.2.2)$$

that is, the average of the offers must equal the median of the
arbitrator's preferred settlement. Substituting (1.2.2) into either of
the first-order conditions then yields

$$w_u^* - w_f^* = \frac{1}{f\left(\dfrac{w_f^* + w_u^*}{2}\right)}; \qquad (1.2.3)$$

that is, the gap between the offers must equal the reciprocal of
the value of the density function at the median of the arbitrator's
preferred settlement.


¹¹In formulating the firm's and the union's optimization problems, we have
assumed that the firm's offer is less than the union's offer. It is straightforward
to show that this inequality must hold in equilibrium.

In order to produce an intuitively appealing comparative-static
result, we now consider an example. Suppose the arbitrator's preferred
settlement is normally distributed with mean $m$ and variance
$\sigma^2$, in which case the density function is

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}(x - m)^2\right\}.$$

(In this example, one can show that the first-order conditions given
earlier are sufficient.) Because a normal distribution is symmetric
around its mean, the median of the distribution equals the mean
of the distribution, $m$. Therefore, (1.2.2) becomes

$$\frac{w_f^* + w_u^*}{2} = m$$

and (1.2.3) becomes

$$w_u^* - w_f^* = \frac{1}{f(m)} = \sqrt{2\pi\sigma^2},$$

so the Nash equilibrium offers are

$$w_u^* = m + \sqrt{\frac{\pi\sigma^2}{2}} \quad \text{and} \quad w_f^* = m - \sqrt{\frac{\pi\sigma^2}{2}}.$$

Thus, in equilibrium, the parties' offers are centered around the
expectation of the arbitrator's preferred settlement (i.e., $m$), and
the gap between the offers increases with the parties' uncertainty
about the arbitrator's preferred settlement (i.e., $\sigma^2$).

The intuition behind this equilibrium is simple. Each party 
faces a trade-off. A more aggressive offer (i.e., a lower offer by
the firm or a higher offer by the union) yields a better payoff if
it is chosen as the settlement by the arbitrator but is less likely
to be chosen. (We will see in Chapter 3 that a similar trade-off
arises in a first-price, sealed-bid auction: a lower bid yields a
better payoff if it is the winning bid but reduces the chances of
winning.) When there is more uncertainty about the arbitrator's
preferred settlement (i.e., $\sigma^2$ is higher), the parties can afford to
be more aggressive because an aggressive offer is less likely to be 
wildly at odds with the arbitrator’s preferred settlement. When 
there is hardly any uncertainty, in contrast, neither party can afford 
to make an offer far from the mean because the arbitrator is very 
likely to prefer settlements close to m. 
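The normal example can be verified numerically. The sketch below is illustrative rather than part of Farber's model: it computes the offers $m \pm \sqrt{\pi\sigma^2/2}$ and checks them against conditions (1.2.2) and (1.2.3) using the standard library's `statistics.NormalDist`. The values of `m` and `sigma2` and the function name are hypothetical.

```python
import math
from statistics import NormalDist


def equilibrium_offers(m, sigma2):
    """Return (w_f, w_u) = (m - sqrt(pi*sigma2/2), m + sqrt(pi*sigma2/2))."""
    half_gap = math.sqrt(math.pi * sigma2 / 2.0)
    return m - half_gap, m + half_gap


# Hypothetical mean and variance of the arbitrator's preferred settlement.
m, sigma2 = 50.0, 4.0
w_f, w_u = equilibrium_offers(m, sigma2)

dist = NormalDist(mu=m, sigma=math.sqrt(sigma2))
midpoint = (w_f + w_u) / 2.0

# (1.2.2): F evaluated at the average offer should equal 1/2.
cdf_at_midpoint = dist.cdf(midpoint)
# (1.2.3): the gap times the density at the average offer should equal 1.
gap_times_density = (w_u - w_f) * dist.pdf(midpoint)

print(w_f, w_u, cdf_at_midpoint, gap_times_density)
```

Raising `sigma2` widens the gap between the two offers, which is the comparative-static result described in the text.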
1.2.D The Problem of the Commons


Since at least Hume (1739), political philosophers and economists 
have understood that if citizens respond only to private incentives, 
public goods will be underprovided and public resources overuti- 
lized. Today, even a casual inspection of the earth’s environment 
reveals the force of this idea. Hardin's (1968) much cited paper 
brought the problem to the attention of noneconomists. Here we 
analyze a bucolic example. 

Consider the $n$ farmers in a village. Each summer, all the
farmers graze their goats on the village green. Denote the number
of goats the $i$th farmer owns by $g_i$ and the total number of goats
in the village by $G = g_1 + \cdots + g_n$. The cost of buying and caring
for a goat is $c$, independent of how many goats a farmer owns.
The value to a farmer of grazing a goat on the green when a
total of $G$ goats are grazing is $v(G)$ per goat. Since a goat needs
at least a certain amount of grass in order to survive, there is
a maximum number of goats that can be grazed on the green,
$G_{\max}$: $v(G) > 0$ for $G < G_{\max}$ but $v(G) = 0$ for $G \ge G_{\max}$. Also,
since the first few goats have plenty of room to graze, adding one
more does little harm to those already grazing, but when so many
goats are grazing that they are all just barely surviving (i.e., $G$ is
just below $G_{\max}$), then adding one more dramatically harms the
rest. Formally: for $G < G_{\max}$, $v'(G) < 0$ and $v''(G) < 0$, as in
Figure 1.2.4.

During the spring, the farmers simultaneously choose how 
many goats to own. Assume goats are continuously divisible. 
A strategy for farmer i is the choice of a number of goats to 
graze on the village green, $g_i$. Assuming that the strategy space
is $[0, \infty)$ covers all the choices that could possibly be of interest
to the farmer; $[0, G_{\max})$ would also suffice. The payoff to farmer $i$
from grazing $g_i$ goats when the numbers of goats grazed by the
other farmers are $(g_1, \ldots, g_{i-1}, g_{i+1}, \ldots, g_n)$ is

$$g_i v(g_1 + \cdots + g_{i-1} + g_i + g_{i+1} + \cdots + g_n) - c g_i. \qquad (1.2.4)$$

Thus, if $(g_1^*, \ldots, g_n^*)$ is to be a Nash equilibrium then, for each $i$,
$g_i^*$ must maximize (1.2.4) given that the other farmers choose
$(g_1^*, \ldots, g_{i-1}^*, g_{i+1}^*, \ldots, g_n^*)$. The first-order condition for this
optimization problem is

$$v(g_i + g_{-i}^*) + g_i v'(g_i + g_{-i}^*) - c = 0, \qquad (1.2.5)$$

where $g_{-i}^*$ denotes $g_1^* + \cdots + g_{i-1}^* + g_{i+1}^* + \cdots + g_n^*$. Substituting
$g_i^*$ into (1.2.5), summing over all $n$ farmers' first-order conditions,
and then dividing by $n$ yields

$$v(G^*) + \frac{1}{n} G^* v'(G^*) - c = 0, \qquad (1.2.6)$$

where $G^*$ denotes $g_1^* + \cdots + g_n^*$. In contrast, the social optimum,
denoted by $G^{**}$, solves

$$\max_{0 \le G < \infty} G v(G) - G c,$$

the first-order condition for which is

$$v(G^{**}) + G^{**} v'(G^{**}) - c = 0. \qquad (1.2.7)$$

Figure 1.2.4.

Comparing (1.2.6) to (1.2.7) shows¹² that $G^* > G^{**}$: too many
goats are grazed in the Nash equilibrium, compared to the social
optimum. The first-order condition (1.2.5) reflects the incentives
faced by a farmer who is already grazing $g_i$ goats but is considering
adding one more (or, strictly speaking, a tiny fraction of one
more). The value of the additional goat is $v(g_i + g_{-i}^*)$ and its cost
is $c$. The harm to the farmer's existing goats is $v'(g_i + g_{-i}^*)$ per goat,
or $g_i v'(g_i + g_{-i}^*)$ in total. The common resource is overutilized
because each farmer considers only his or her own incentives, not
the effect of his or her actions on the other farmers, hence the
presence of $G^* v'(G^*)/n$ in (1.2.6) but $G^{**} v'(G^{**})$ in (1.2.7).

¹²Suppose, to the contrary, that $G^* \le G^{**}$. Then $v(G^*) \ge v(G^{**})$, since $v' < 0$.
Likewise, $0 > v'(G^*) \ge v'(G^{**})$, since $v'' < 0$. Finally, $G^*/n < G^{**}$. Thus, the
left-hand side of (1.2.6) strictly exceeds the left-hand side of (1.2.7), which is
impossible since both equal zero.
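The comparison between (1.2.6) and (1.2.7) can be made concrete with a specific value function. The sketch below assumes $v(G) = A - G^2$ for $G < \sqrt{A}$, which is not taken from the text; it is simply one function satisfying $v' < 0$ and $v'' < 0$ for $G > 0$. The values of `n`, `A`, and `c` (with $c < A$) and the function names are likewise hypothetical. Both first-order conditions have the form $v(G) + s \cdot G v'(G) - c = 0$, with $s = 1/n$ for the Nash equilibrium and $s = 1$ for the social optimum, so one bisection routine solves both.

```python
import math


def foc(G, share, A, c):
    """v(G) + share * G * v'(G) - c for the assumed v(G) = A - G**2."""
    return (A - G * G) + share * G * (-2.0 * G) - c


def solve_by_bisection(share, A, c, tol=1e-12):
    """Find the root of foc on [0, sqrt(A)], where foc is strictly decreasing."""
    lo, hi = 0.0, math.sqrt(A)  # sqrt(A) plays the role of G_max here
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if foc(mid, share, A, c) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical parameter values, chosen only for illustration (c < A).
n, A, c = 5, 100.0, 10.0
G_nash = solve_by_bisection(1.0 / n, A, c)   # condition (1.2.6)
G_social = solve_by_bisection(1.0, A, c)     # condition (1.2.7)
print(G_nash, G_social)
```

With these numbers the Nash total exceeds the social optimum, in line with $G^* > G^{**}$; increasing `n` pushes the Nash total further above it.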


1.3 Advanced Theory: Mixed Strategies and
Existence of Equilibrium 


1.3.A Mixed Strategies 


In Section 1.1.C we defined $S_i$ to be the set of strategies available
to player $i$, and the combination of strategies $(s_1^*, \ldots, s_n^*)$ to be a
Nash equilibrium if, for each player $i$, $s_i^*$ is player $i$'s best response
to the strategies of the $n - 1$ other players:

$$u_i(s_1^*, \ldots, s_{i-1}^*, s_i^*, s_{i+1}^*, \ldots, s_n^*) \ge u_i(s_1^*, \ldots, s_{i-1}^*, s_i, s_{i+1}^*, \ldots, s_n^*) \qquad \text{(NE)}$$

for every strategy $s_i$ in $S_i$. By this definition, there is no Nash
equilibrium in the following game, known as Matching Pennies. 


                         Player 2
                     Heads      Tails
Player 1   Heads     -1, 1      1, -1
           Tails      1, -1    -1, 1

                   Matching Pennies


In this game, each player's strategy space is {Heads, Tails}. As
a story to accompany the payoffs in the bi-matrix, imagine that
each player has a penny and must choose whether to display it
with heads or tails facing up. If the two pennies match (i.e., both
are heads up or both are tails up) then player 2 wins player 1's
penny; if the pennies do not match then 1 wins 2's penny. No
pair of strategies can satisfy (NE), since if the players' strategies
match—(Heads, Heads) or (Tails, Tails) —then player 1 prefers to 
switch strategies, while if the strategies do not match—(Heads, 
Tails) or (Tails, Heads)—then player 2 prefers to do so. 

The distinguishing feature of Matching Pennies is that each 
player would like to outguess the other. Versions of this game also 
arise in poker, baseball, battle, and other settings. In poker, the 
analogous question is how often to bluff: if player i is known never 
to bluff then i's opponents will fold whenever i bids aggressively, 
thereby making it worthwhile for i to bluff on occasion; on the 
other hand, bluffing too often is also a losing strategy. In baseball, 
suppose that a pitcher can throw either a fastball or a curve and 
that a batter can hit either pitch if (but only if) it is anticipated 
correctly. Similarly, in battle, suppose that the attackers can choose 
between two locations (or two routes, such as "by land or by sea") 
and that the defense can parry either attack if (but only if) it is 
anticipated correctly. 

In any game in which each player would like to outguess the 
other(s), there is no Nash equilibrium (at least as this equilib- 
rium concept was defined in Section 1.1.C) because the solution 
to such a game necessarily involves uncertainty about what the 
players will do. We now introduce the notion of a mixed strategy, 
which we will interpret in terms of one player's uncertainty about 
what another player will do. (This interpretation was advanced 
by Harsanyi [1973]; we discuss it further in Section 3.2.A.) In the 
next section we will extend the definition of Nash equilibrium 
to include mixed strategies, thereby capturing the uncertainty in- 
herent in the solution to games such as Matching Pennies, poker, 
baseball, and battle. 

Formally, a mixed strategy for player $i$ is a probability distribution
over (some or all of) the strategies in $S_i$. We will hereafter
refer to the strategies in $S_i$ as player $i$'s pure strategies. In the
simultaneous-move games of complete information analyzed in
this chapter, a player's pure strategies are the different actions the
player could take. In Matching Pennies, for example, $S_i$ consists
of the two pure strategies Heads and Tails, so a mixed strategy
for player $i$ is the probability distribution $(q, 1 - q)$, where $q$ is
the probability of playing Heads, $1 - q$ is the probability of playing
Tails, and $0 \le q \le 1$. The mixed strategy $(0, 1)$ is simply the
pure strategy Tails; likewise, the mixed strategy $(1, 0)$ is the pure
strategy Heads.

As a second example of a mixed strategy, recall Figure 1.1.1,
where player 2 has the pure strategies Left, Middle, and Right.
Here a mixed strategy for player 2 is the probability distribution
$(q, r, 1 - q - r)$, where $q$ is the probability of playing Left, $r$ is the
probability of playing Middle, and $1 - q - r$ is the probability of
playing Right. As before, $0 \le q \le 1$, and now also $0 \le r \le 1$ and
$0 \le q + r \le 1$. In this game, the mixed strategy $(1/3, 1/3, 1/3)$ puts
equal probability on Left, Middle, and Right, whereas $(1/2, 1/2, 0)$
puts equal probability on Left and Middle but no probability on
Right. As always, a player's pure strategies are simply the limiting
cases of the player's mixed strategies; here player 2's pure
strategy Left is the mixed strategy $(1, 0, 0)$, for example.

More generally, suppose that player $i$ has $K$ pure strategies:
$S_i = \{s_{i1}, \ldots, s_{iK}\}$. Then a mixed strategy for player $i$ is a
probability distribution $(p_{i1}, \ldots, p_{iK})$, where $p_{ik}$ is the probability that
player $i$ will play strategy $s_{ik}$, for $k = 1, \ldots, K$. Since $p_{ik}$ is a
probability, we require $0 \le p_{ik} \le 1$ for $k = 1, \ldots, K$ and $p_{i1} + \cdots + p_{iK} = 1$.
We will use $p_i$ to denote an arbitrary mixed strategy from the set
of probability distributions over $S_i$, just as we use $s_i$ to denote an
arbitrary pure strategy from $S_i$.


Definition In the normal-form game $G = \{S_1, \ldots, S_n; u_1, \ldots, u_n\}$,
suppose $S_i = \{s_{i1}, \ldots, s_{iK}\}$. Then a mixed strategy for player $i$ is a probability
distribution $p_i = (p_{i1}, \ldots, p_{iK})$, where $0 \le p_{ik} \le 1$ for $k = 1, \ldots, K$ and
$p_{i1} + \cdots + p_{iK} = 1$.


We conclude this section by returning (briefly) to the notion of 
strictly dominated strategies introduced in Section 1.1.B, so as to 
illustrate the potential roles for mixed strategies in the arguments 
made there. Recall that if a strategy s; is strictly dominated then 
there is no belief that player i could hold (about the strategies 
the other players will choose) such that it would be optimal to 
play s;. The converse is also true, provided we allow for mixed 
Strategies: if there is no belief that player i could hold (about 
the strategies the other players will choose) such that it would be 
optimal to play the strategy s;, then there exists another strategy 
that strictly dominates 5.3 The games in Figures 1.3.1 and 1.3.2 


"Pearce (1984) proves this result for the two-player case and notes that it holds 
for the n-player case provided that the pla yers' mixed strategies are allowed to be 
correlated—that is, player i's belief about what player j will do must be allowed 
to be correlated with Ра belief about what player k will do. Aumann (1987) 
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show that this converse would be false if we restricted attention 
to pure strategies, 

Figure 1.3.1 shows that a given pure strategy may be strictly 
dominated by a mixed strategy, even if the pure strategy is not 
strictly dominated by any other pure strategy. In this game, for 
any belief (q,1— 4) that player 1 could hold about 2's play, 1's best 
response is either T (if q > 1/2) or M (if q < 1/2), but never B. 
Yet B is not strictly dominated by either T or M. The key is that 
B is strictly dominated by a mixed strategy: if player 1 plays T 
with probability 1/2 and M with probability 1/2 then 1° expected 
payoff is 3/2 no matter what (pure or mixed) strategy 2 plays, and 
3/2 exceeds the payoff of 1 that playing B surely produces. This 
example illustrates the role of mixed strategies in finding "another 
strategy that strictly dominates у.” 





suggests that such correlation in i's beliefs is entirely natural, even if j and k 
make their choices completely independently: for example, i may know that 
both j and К went to business school, or perhaps to the same business school, 
but may not know what is taught there. 
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Figure 1.3.2 shows that a given pure strategy can be a best 
response to a mixed strategy, even if the pure strategy is not a 
best response to any other pure strategy. In this game, B is not a 
best response for player 1 to either L or R by player 2, but B is 
the best response for player 1 to the mixed strategy (4,1-4) by 
player 2, provided 1/3 « 4 « 2/3. This example illustrates the role 
of mixed strategies in the “belief that player i could hold." 
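The two claims above can be checked with a short computation. The payoff numbers in the sketch below are assumptions chosen to be consistent with the statements in the text (a half-half mix of T and M earns 3/2, B earns 1 in the first game, and B is a best response for $1/3 < q < 2/3$ in the second); the figures themselves are not reproduced here, so these matrices should be read as hypothetical stand-ins. Only player 1's payoffs are needed, and $q$ is the probability that player 2 plays L.

```python
# Player 1's payoffs against (L, R). Assumed values, consistent with the text.
FIG_1_3_1 = {"T": (3, 0), "M": (0, 3), "B": (1, 1)}
FIG_1_3_2 = {"T": (3, 0), "M": (0, 3), "B": (2, 2)}


def expected_payoff(game, row, q):
    """Player 1's expected payoff from `row` when 2 plays L with probability q."""
    vs_left, vs_right = game[row]
    return q * vs_left + (1 - q) * vs_right


def best_responses(game, q, tol=1e-12):
    """All rows whose expected payoff is within tol of the maximum."""
    values = {row: expected_payoff(game, row, q) for row in game}
    top = max(values.values())
    return sorted(row for row, value in values.items() if value >= top - tol)


grid = [k / 100 for k in range(101)]

# Figure 1.3.1 claim: the half-half mix of T and M beats B against every belief.
mix_beats_b = all(
    0.5 * expected_payoff(FIG_1_3_1, "T", q)
    + 0.5 * expected_payoff(FIG_1_3_1, "M", q)
    > expected_payoff(FIG_1_3_1, "B", q)
    for q in grid
)
b_never_best = all("B" not in best_responses(FIG_1_3_1, q) for q in grid)

# Figure 1.3.2 claim: B is a best response to q = 1/2 but not to pure L or R.
print(mix_beats_b, b_never_best)
print(best_responses(FIG_1_3_2, 1.0),
      best_responses(FIG_1_3_2, 0.5),
      best_responses(FIG_1_3_2, 0.0))
```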


1.3.B Existence of Nash Equilibrium 


In this section we discuss several topics related to the existence of 
Nash equilibrium. First, we extend the definition of Nash equi- 
librium given in Section 1.1.C to allow for mixed strategies. Sec- 
ond, we apply this extended definition to Matching Pennies and 
the Battle of the Sexes. Third, we use a graphical argument to 
show that any two-player game in which each player has two 
pure strategies has a Nash equilibrium (possibly involving mixed 
strategies). Finally, we state and discuss Nash's (1950) Theorem, 
which guarantees that any finite game (i.e., any game with a fi- 
nite number of players, each of whom has a finite number of 
pure strategies) has a Nash equilibrium (again, possibly involving 
mixed strategies).

Recall that the definition of Nash equilibrium given in Section 
1.1.C guarantees that each player's pure strategy is a best response
to the other players' pure strategies. To extend the definition to in- 
clude mixed strategies, we simply require that each player's mixed 
strategy be a best response to the other players' mixed strategies. 
Since any pure strategy can be represented as the mixed strategy 
that puts zero probability on all of the player's other pure strate- 
gies, this extended definition subsumes the earlier one. 

Computing player i's best response to a mixed strategy by
player j illustrates the interpretation of player j's mixed strategy
as representing player i's uncertainty about what player j will do.
We begin with Matching Pennies as an example. Suppose that
player 1 believes that player 2 will play Heads with probability q
and Tails with probability 1 - q; that is, 1 believes that 2 will play
the mixed strategy (q, 1 - q). Given this belief, player 1's expected
payoffs are q·(-1) + (1 - q)·1 = 1 - 2q from playing Heads and
q·1 + (1 - q)·(-1) = 2q - 1 from playing Tails. Since 1 - 2q > 2q - 1
if and only if q < 1/2, player 1's best pure-strategy response is
Heads if q < 1/2 and Tails if q > 1/2, and player 1 is indifferent
between Heads and Tails if q = 1/2. It remains to consider possible
mixed-strategy responses by player 1.

Let (r, 1 - r) denote the mixed strategy in which player 1 plays
Heads with probability r. For each value of q between zero and
one, we now compute the value(s) of r, denoted r*(q), such that
(r, 1 - r) is a best response for player 1 to (q, 1 - q) by player 2.
The results are summarized in Figure 1.3.3. Player 1's expected
payoff from playing (r, 1 - r) when 2 plays (q, 1 - q) is

\[
rq \cdot (-1) + r(1-q) \cdot 1 + (1-r)q \cdot 1 + (1-r)(1-q) \cdot (-1)
  = (2q - 1) + r(2 - 4q), \tag{1.3.1}
\]

where rq is the probability of (Heads, Heads), r(1 - q) the probabil-
ity of (Heads, Tails), and so on. Since player 1's expected payoff
is increasing in r if 2 - 4q > 0 and decreasing in r if 2 - 4q < 0,
player 1's best response is r = 1 (i.e., Heads) if q < 1/2 and r = 0
(i.e., Tails) if q > 1/2, as indicated by the two horizontal segments
of r*(q) in Figure 1.3.3. This statement is stronger than the closely
related statement in the previous paragraph: there we considered
only pure strategies and found that if q < 1/2 then Heads is the
best pure strategy and that if q > 1/2 then Tails is the best pure
strategy; here we consider all pure and mixed strategies but again
find that if q < 1/2 then Heads is the best of all (pure or mixed)
strategies and that if q > 1/2 then Tails is the best of all strategies.
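As a small check on (1.3.1), the following sketch computes player 1's expected payoff in Matching Pennies term by term and reads the best response off the sign of the coefficient on r. The function names are illustrative, and exact fractions are used so that the knife-edge case q = 1/2 is detected without rounding error.

```python
from fractions import Fraction


def expected_payoff_p1(r, q):
    """Player 1's expected payoff from (r, 1 - r) against (q, 1 - q),
    summed over the four outcomes as on the left-hand side of (1.3.1)."""
    return (r * q * (-1)                  # (Heads, Heads)
            + r * (1 - q) * 1             # (Heads, Tails)
            + (1 - r) * q * 1             # (Tails, Heads)
            + (1 - r) * (1 - q) * (-1))   # (Tails, Tails)


def best_response_r(q):
    """Return r*(q) as a closed interval (low, high) of optimal values of r."""
    slope = 2 - 4 * q          # coefficient on r in (1.3.1)
    if slope > 0:
        return (1, 1)          # Heads
    if slope < 0:
        return (0, 0)          # Tails
    return (0, 1)              # indifferent: every r is optimal


q = Fraction(1, 4)
r = Fraction(1, 3)
closed_form = (2 * q - 1) + r * (2 - 4 * q)   # right-hand side of (1.3.1)
```

Returning an interval rather than a single number anticipates the case q = 1/2, where every r in [0, 1] is optimal.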

The nature of player 1's best response to (q, 1 - q) changes
when q = 1/2. As noted earlier, when q = 1/2 player 1 is indif-
ferent between the pure strategies Heads and Tails. Furthermore,
because player 1's expected payoff in (1.3.1) is independent of r
when q = 1/2, player 1 is also indifferent among all mixed strate-
gies (r, 1 - r). That is, when q = 1/2 the mixed strategy (r, 1 - r)
is a best response to (q, 1 - q) for any value of r between zero
and one. Thus, r*(1/2) is the entire interval [0, 1], as indicated
by the vertical segment of r*(q) in Figure 1.3.3. In the analysis
of the Cournot model in Section 1.2.A, we called R_i(q_j) firm i's
best-response function. Here, because there exists a value of q
such that r*(q) has more than one value, we call r*(q) player 1's
best-response correspondence.

[Figure 1.3.3: r*(q), with q on the horizontal axis and r on the
vertical axis, each running from 0 (Tails) to 1 (Heads).]

The events A and B are independent if Prob{A and B} = Prob{A}·Prob{B}.
Thus, in writing rq for the probability that 1 plays Heads and 2 plays Heads,
we are assuming that 1 and 2 make their choices independently, as befits the
description we gave of simultaneous-move games. See Aumann (1974) for the
definition of correlated equilibrium, which applies to games in which the players'
choices can be correlated (because the players observe the outcome of a random
event, such as a coin flip, before choosing their strategies).

To derive player i's best response to player j's mixed strategy
more generally, and to give a formal statement of the extended def-
inition of Nash equilibrium, we now restrict attention to the two-
player case, which captures the main ideas as simply as possible.
Let J denote the number of pure strategies in S_1 and K the number
in S_2. We will write S_1 = {s_11, ..., s_1J} and S_2 = {s_21, ..., s_2K}, and
we will use s_1j and s_2k to denote arbitrary pure strategies from S_1
and S_2, respectively.

If player 1 believes that player 2 will play the strategies
(s_21, ..., s_2K) with the probabilities (p_21, ..., p_2K) then player 1's
expected payoff from playing the pure strategy s_1j is

\[
\sum_{k=1}^{K} p_{2k}\, u_1(s_{1j}, s_{2k}), \tag{1.3.2}
\]

and player 1's expected payoff from playing the mixed strategy
p_1 = (p_11, ..., p_1J) is

\[
v_1(p_1, p_2)
  = \sum_{j=1}^{J} p_{1j} \left[ \sum_{k=1}^{K} p_{2k}\, u_1(s_{1j}, s_{2k}) \right]
  = \sum_{j=1}^{J} \sum_{k=1}^{K} p_{1j} \cdot p_{2k}\, u_1(s_{1j}, s_{2k}), \tag{1.3.3}
\]

where p_1j · p_2k is the probability that 1 plays s_1j and 2 plays s_2k.
Player 1's expected payoff from the mixed strategy p_1, given in
(1.3.3), is the weighted sum of the expected payoff for each of the
pure strategies {s_11, ..., s_1J}, given in (1.3.2), where the weights
are the probabilities (p_11, ..., p_1J). Thus, for the mixed strategy
(p_11, ..., p_1J) to be a best response for player 1 to 2's mixed strategy
p_2, it must be that p_1j > 0 only if

\[
\sum_{k=1}^{K} p_{2k}\, u_1(s_{1j}, s_{2k}) \ge \sum_{k=1}^{K} p_{2k}\, u_1(s_{1j'}, s_{2k})
\]

for every s_1j' in S_1. That is, for a mixed strategy to be a best
response to p_2 it must put positive probability on a given pure
strategy only if the pure strategy is itself a best response to p_2.
Conversely, if player 1 has several pure strategies that are best
responses to p_2, then any mixed strategy that puts all its probabil-
ity on some or all of these pure-strategy best responses (and zero
probability on all other pure strategies) is also a best response for
player 1 to p_2.
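The support condition just stated translates directly into a computation. The sketch below is a minimal illustration, not a general-purpose solver: the names `pure_payoffs`, `mixed_payoff` and `is_best_response` are hypothetical, payoffs are stored as a nested list `u1[j][k]`, and the tolerance `tol` is an assumption introduced only to absorb floating-point error.

```python
def pure_payoffs(u1, p2):
    """Expected payoff (1.3.2) of each pure strategy of player 1 against p2.

    u1[j][k] is player 1's payoff when 1 plays pure strategy j and
    2 plays pure strategy k; p2[k] is the probability of k.
    """
    return [sum(p * u for p, u in zip(p2, row)) for row in u1]


def mixed_payoff(u1, p1, p2):
    """Expected payoff (1.3.3): the p1-weighted sum of the pure payoffs."""
    return sum(pj * vj for pj, vj in zip(p1, pure_payoffs(u1, p2)))


def is_best_response(u1, p1, p2, tol=1e-12):
    """True if p1 puts positive probability only on pure strategies that
    are themselves best responses to p2."""
    payoffs = pure_payoffs(u1, p2)
    best = max(payoffs)
    return all(pj <= tol or vj >= best - tol for pj, vj in zip(p1, payoffs))


# Matching Pennies payoffs for player 1 (rows and columns: Heads, Tails).
u1 = [[-1, 1], [1, -1]]
against_uniform = pure_payoffs(u1, [0.5, 0.5])
```

Against the uniform belief both pure strategies earn the same payoff, so every mixed strategy passes the check; against (1/4, 3/4) only strategies that put all their weight on Heads do.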

To give a formal statement of the extended definition of Nash
equilibrium, we need to compute player 2's expected payoff when
players 1 and 2 play the mixed strategies p_1 and p_2, respectively. If
player 2 believes that player 1 will play the strategies (s_11, ..., s_1J)
with the probabilities (p_11, ..., p_1J), then player 2's expected pay-
off from playing the strategies (s_21, ..., s_2K) with the probabilities
(p_21, ..., p_2K) is

\[
v_2(p_1, p_2)
  = \sum_{k=1}^{K} p_{2k} \left[ \sum_{j=1}^{J} p_{1j}\, u_2(s_{1j}, s_{2k}) \right]
  = \sum_{j=1}^{J} \sum_{k=1}^{K} p_{1j} \cdot p_{2k}\, u_2(s_{1j}, s_{2k}).
\]

Given v_1(p_1, p_2) and v_2(p_1, p_2) we can restate the requirement of
Nash equilibrium that each player's mixed strategy be a best re-
sponse to the other player's mixed strategy: for the pair of mixed
strategies (p_1*, p_2*) to be a Nash equilibrium, p_1* must satisfy

\[
v_1(p_1^*, p_2^*) \ge v_1(p_1, p_2^*) \tag{1.3.4}
\]

for every probability distribution p_1 over S_1, and p_2* must satisfy

\[
v_2(p_1^*, p_2^*) \ge v_2(p_1^*, p_2) \tag{1.3.5}
\]

for every probability distribution p_2 over S_2.


Definition In the two-player normal-form game G = {S_1, S_2; u_1, u_2},
the mixed strategies (p_1*, p_2*) are a Nash equilibrium if each player's mixed
strategy is a best response to the other player's mixed strategy: (1.3.4) and
(1.3.5) must hold.
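Conditions (1.3.4) and (1.3.5) quantify over every probability distribution, but because each player's expected payoff is linear in that player's own probabilities, it is enough to compare against pure-strategy deviations. The sketch below relies on that fact; the function names and the tolerance `tol` are illustrative assumptions.

```python
def expected_payoff(u, p1, p2):
    """Expected payoff under payoff matrix u when 1 plays p1 and 2 plays p2
    (the double sum used for v_1 and v_2)."""
    return sum(p1[j] * p2[k] * u[j][k]
               for j in range(len(p1)) for k in range(len(p2)))


def is_nash(u1, u2, p1, p2, tol=1e-9):
    """Check (1.3.4) and (1.3.5) for the pair (p1, p2).

    A player's payoff is linear in that player's own mixing probabilities,
    so no mixed deviation can beat the best pure deviation; only pure
    deviations are examined.
    """
    v1 = expected_payoff(u1, p1, p2)
    v2 = expected_payoff(u2, p1, p2)
    best1 = max(sum(p2[k] * u1[j][k] for k in range(len(p2)))
                for j in range(len(p1)))
    best2 = max(sum(p1[j] * u2[j][k] for j in range(len(p1)))
                for k in range(len(p2)))
    return v1 >= best1 - tol and v2 >= best2 - tol


# Matching Pennies (rows and columns: Heads, Tails).
mp_u1 = [[-1, 1], [1, -1]]
mp_u2 = [[1, -1], [-1, 1]]
mixed_is_equilibrium = is_nash(mp_u1, mp_u2, [0.5, 0.5], [0.5, 0.5])
```

In Matching Pennies the pair ((1/2, 1/2), (1/2, 1/2)) passes, while any pure-strategy pair fails because one player always gains by switching.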


We next apply this definition to Matching Pennies and the Bat-
tle of the Sexes. To do so, we use the graphical representation of
player i's best response to player j's mixed strategy introduced in
Figure 1.3.3. To complement Figure 1.3.3, we now compute the
value(s) of q, denoted q*(r), such that (q, 1 - q) is a best response
for player 2 to (r, 1 - r) by player 1. The results are summarized in
Figure 1.3.4. If r < 1/2 then 2's best response is Tails, so q*(r) = 0;
likewise, if r > 1/2 then 2's best response is Heads, so q*(r) = 1.
If r = 1/2 then player 2 is indifferent not only between Heads and
Tails but also among all the mixed strategies (q, 1 - q), so q*(1/2)
is the entire interval [0, 1].

After flipping and rotating Figure 1.3.4, we have Figure 1.3.5.
Figure 1.3.5 is less convenient than Figure 1.3.4 as a representation
of player 2's best response to player 1's mixed strategy, but it can
be combined with Figure 1.3.3 to produce Figure 1.3.6.

[Figure 1.3.4: q*(r), with r on the horizontal axis and q on the
vertical axis, each running from 0 (Tails) to 1 (Heads).]

[Figure 1.3.5: q*(r) after flipping and rotating Figure 1.3.4.]

[Figure 1.3.6: r*(q) and q*(r) drawn on the same axes.]

Figure 1.3.6 is analogous to Figure 1.2.1 from the Cournot anal-
ysis in Section 1.2.A. Just as the intersection of the best-response
functions R_2(q_1) and R_1(q_2) gave the Nash equilibrium of the
Cournot game, the intersection of the best-response correspon-
dences r*(q) and q*(r) yields the (mixed-strategy) Nash equilib-
rium in Matching Pennies: if player i plays (1/2, 1/2) then
(1/2, 1/2) is a best response for player j, as required for Nash
equilibrium.

It is worth emphasizing that such a mixed-strategy Nash equi-
librium does not rely on any player flipping coins, rolling dice,
or otherwise choosing a strategy at random. Rather, we interpret
player j's mixed strategy as a statement of player i's uncertainty
about player j's choice of a (pure) strategy. In baseball, for ex-
ample, the pitcher might decide whether to throw a fastball or a
curve based on how well each pitch was thrown during pregame
practice. If the batter understands how the pitcher will make a
choice but did not observe the pitcher's practice, then the batter
may believe that the pitcher is equally likely to throw a fastball or a
curve. We would then represent the batter's belief by the pitcher's
mixed strategy (1/2, 1/2), when in fact the pitcher chooses a pure
strategy based on information unavailable to the batter. Stated
more generally, the idea is to endow player j with a small amount
of private information such that, depending on the realization of
the private information, player j slightly prefers one of the rele-
vant pure strategies. Since player i does not observe j's private
information, however, i remains uncertain about j's choice, and
we represent i's uncertainty by j's mixed strategy. We provide a
more formal statement of this interpretation of a mixed strategy
in Section 3.2.A.

As a second example of a mixed-strategy Nash equilibrium,
consider the Battle of the Sexes from Section 1.1.C. Let (q, 1 - q) be
the mixed strategy in which Pat plays Opera with probability q,
and let (r, 1 - r) be the mixed strategy in which Chris plays Opera
with probability r. If Pat plays (q, 1 - q) then Chris's expected
payoffs are q·2 + (1 - q)·0 = 2q from playing Opera and q·0 +
(1 - q)·1 = 1 - q from playing Fight. Thus, if q > 1/3 then Chris's
best response is Opera (i.e., r = 1), if q < 1/3 then Chris's best
response is Fight (i.e., r = 0), and if q = 1/3 then any value of
r is a best response. Similarly, if Chris plays (r, 1 - r) then Pat's
expected payoffs are r·1 + (1 - r)·0 = r from playing Opera and
r·0 + (1 - r)·2 = 2(1 - r) from playing Fight. Thus, if r > 2/3 then
Pat's best response is Opera (i.e., q = 1), if r < 2/3 then Pat's best
response is Fight (i.e., q = 0), and if r = 2/3 then any value of q
is a best response. As shown in Figure 1.3.7, the mixed strategies
(q, 1 - q) = (1/3, 2/3) for Pat and (r, 1 - r) = (2/3, 1/3) for Chris
are therefore a Nash equilibrium.
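The two thresholds in this example come from indifference conditions, which can be solved mechanically. The helper below is a hypothetical sketch for two-by-two games only; it assumes the player is not indifferent for every belief, since in that case the denominator is zero and the function raises `ZeroDivisionError`.

```python
from fractions import Fraction


def indifference_probability(a, b, c, d):
    """Probability t on the opponent's first strategy at which a player is
    indifferent between two own strategies.

    The first own strategy pays a or b, and the second pays c or d, against
    the opponent's first or second strategy. Solves
        t*a + (1 - t)*b == t*c + (1 - t)*d.
    """
    return Fraction(d - b, (a - c) + (d - b))


# Chris: Opera pays 2 or 0 and Fight pays 0 or 1, against Pat's Opera or Fight.
q_star = indifference_probability(2, 0, 0, 1)

# Pat: Opera pays 1 or 0 and Fight pays 0 or 2, against Chris's Opera or Fight.
r_star = indifference_probability(1, 0, 0, 2)
```

Here `q_star` is the probability of Opera by Pat that leaves Chris indifferent, and `r_star` is the probability of Opera by Chris that leaves Pat indifferent; they reproduce the values 1/3 and 2/3 derived in the text.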

Unlike in Figure 1.3.6, where there was only one intersection
of the players' best-response correspondences, there are three in-
tersections of r*(q) and q*(r) in Figure 1.3.7: (q = 0, r = 0) and
(q = 1, r = 1), as well as (q = 1/3, r = 2/3). The other two inter-
sections represent the pure-strategy Nash equilibria (Fight, Fight)
and (Opera, Opera) described in Section 1.1.C.

In any game, a Nash equilibrium (involving pure or mixed
strategies) appears as an intersection of the players' best-response
correspondences, even when there are more than two players, and
even when some or all of the players have more than two pure
strategies. Unfortunately, the only games in which the players'
best-response correspondences have simple graphical representa-
tions are two-player games in which each player has only two
strategies. We turn next to a graphical argument that any such
game has a Nash equilibrium (possibly involving mixed strate-
gies).

[Figure 1.3.7: r*(q) and q*(r) in the Battle of the Sexes.]

[Figure 1.3.8: payoffs for player 1 against player 2's strategies
Left and Right.]

Consider the payoffs for player 1 given in Figure 1.3.8. There
are two important comparisons: x versus z, and y versus w. Based
on these comparisons, we can define four main cases: (i) x > z and
y > w, (ii) x < z and y < w, (iii) x > z and y < w, and (iv) x < z
and y > w. We first discuss these four main cases, and then turn
to the remaining cases involving x = z or y = w.

[Figure 1.3.9: r*(q) in Case (i) and Case (ii).]


In case (i) Up strictly dominates Down for player 1, and in
case (ii) Down strictly dominates Up. Recall from the previous
section that a strategy s_i is strictly dominated if and only if there
is no belief that player i could hold (about the strategies the other
players will choose) such that it would be optimal to play s_i. Thus,
if (q, 1 - q) is a mixed strategy for player 2, where q is the prob-
ability that 2 will play Left, then in case (i) there is no value of
q such that Down is optimal for player 1, and in case (ii) there is
no value of q such that Up is optimal. Letting (r, 1 - r) denote
a mixed strategy for player 1, where r is the probability that 1
will play Up, we can represent the best-response correspondences
for cases (i) and (ii) as in Figure 1.3.9. (In these two cases the
best-response correspondences are in fact best-response functions,
since there is no value of q such that player 1 has multiple best
responses.)

In cases (iii) and (iv), neither Up nor Down is strictly domi-
nated. Thus, Up must be optimal for some values of q and Down
optimal for others. Let q' = (w - y)/(x - z + w - y). Then in
case (iii) Up is optimal for q > q' and Down for q < q', whereas in
case (iv) the reverse is true. In both cases, any value of r is optimal
when q = q'. These best-response correspondences are given in
Figure 1.3.10.

[Figure 1.3.10: r*(q) in Case (iii) and Case (iv).]
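The four cases can be reproduced with a few lines of code. This is a sketch under one stated assumption: Figure 1.3.8 is not reproduced here, so the layout used below (Up pays x against Left and y against Right; Down pays z against Left and w against Right) is inferred from the comparisons "x versus z, y versus w" and from the formula for q'. The function names are illustrative.

```python
from fractions import Fraction


def best_response_r(q, x, y, z, w):
    """Player 1's optimal probabilities of Up, as an interval (low, high),
    when player 2 plays Left with probability q.

    ASSUMED layout: Up pays x (vs Left) or y (vs Right);
    Down pays z (vs Left) or w (vs Right).
    """
    up = q * x + (1 - q) * y
    down = q * z + (1 - q) * w
    if up > down:
        return (1, 1)
    if up < down:
        return (0, 0)
    return (0, 1)


def q_prime(x, y, z, w):
    """Threshold q' = (w - y)/(x - z + w - y); meaningful in cases (iii), (iv)."""
    return Fraction(w - y, (x - z) + (w - y))


# Case (iii): x > z and y < w, so Up is optimal for q above the threshold.
case_iii = dict(x=2, y=0, z=0, w=1)
threshold_iii = q_prime(**case_iii)
```

In case (i) the function returns (1, 1) for every q, in case (ii) it returns (0, 0) for every q, and in cases (iii) and (iv) the switch occurs exactly at q'.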


Since q' = 1 if x = z and q' = 0 if y = w, the best-response
correspondences for cases involving either x = z or y = w are L-
shaped (i.e., two adjacent sides of the unit square), as would occur
in Figure 1.3.10 if q' = 0 or 1 in cases (iii) or (iv).

Adding arbitrary payoffs for player 2 to Figure 1.3.8 and per- 
forming the analogous computations yields the same four best- 
response correspondences, except that the horizontal axis mea- 
sures r and the vertical q, as in Figure 1.3.4. Flipping and rotating 
these four figures, as was done to produce Figure 1.3.5, yields 
Figures 1.3.11 and 1.3.12. (In the latter figures, r' is defined anal-
ogously to q' in Figure 1.3.10.)

The crucial point is that given any of the four best-response cor-
respondences for player 1, r*(q) from Figures 1.3.9 or 1.3.10, and
any of the four for player 2, q*(r) from Figures 1.3.11 or 1.3.12, the
pair of best-response correspondences has at least one intersec-
tion, so the game has at least one Nash equilibrium. Checking all
sixteen possible pairs of best-response correspondences is left as
an exercise. Instead, we describe the qualitative features that can
result. There can be: (1) a single pure-strategy Nash equilibrium;
(2) a single mixed-strategy equilibrium; or (3) two pure-strategy
equilibria and a single mixed-strategy equilibrium. Recall from
Figure 1.3.6 that Matching Pennies is an example of case (2), and
from Figure 1.3.7 that the Battle of the Sexes is an example of
case (3). The Prisoners' Dilemma is an example of case (1); it re-
sults from combining case (i) or (ii) of r*(q) with case (i) or (ii)
of q*(r).

[Figures 1.3.11 and 1.3.12: best-response correspondences q*(r)
for player 2.]

[Figure 1.3.13: a continuous function f(x) on [0, 1] with a fixed
point x*.]

We conclude this section with a discussion of the existence 
of a Nash equilibrium in more general games. If the above ar- 
guments for two-by-two games are stated mathematically rather 
than graphically, then they can be generalized to apply to n-player 
games with arbitrary finite strategy spaces. 


Theorem (Nash 1950) In the n-player normal-form game G =
{S_1, ..., S_n; u_1, ..., u_n}, if n is finite and S_i is finite for every i then there
exists at least one Nash equilibrium, possibly involving mixed strategies.


The proof of Nash's Theorem involves a fixed-point theorem.
As a simple example of a fixed-point theorem, suppose f(x) is
a continuous function with domain [0, 1] and range [0, 1]. Then
Brouwer's Fixed-Point Theorem guarantees that there exists at
least one fixed point, that is, at least one value x*
in [0, 1] such that f(x*) = x*. Figure 1.3.13 provides an example.
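In this one-dimensional setting a fixed point can also be located numerically. The sketch below uses bisection on g(x) = f(x) - x, which is a technique chosen here for illustration and is not part of the text's argument: since f maps [0, 1] into [0, 1], g(0) ≥ 0 and g(1) ≤ 0, so continuity guarantees a root between them. The function name and the tolerance are assumptions.

```python
import math


def fixed_point(f, tol=1e-10):
    """Approximate a fixed point of f on [0, 1].

    Assumes f is continuous and maps [0, 1] into [0, 1]. Bisection keeps
    the invariant f(lo) - lo >= 0 and f(hi) - hi < 0.
    """
    lo, hi = 0.0, 1.0
    if f(lo) == lo:
        return lo
    if f(hi) == hi:
        return hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) - mid >= 0:
            lo = mid        # a fixed point lies at or to the right of mid
        else:
            hi = mid        # a fixed point lies to the left of mid
    return (lo + hi) / 2


# cos maps [0, 1] into [cos(1), 1], a subset of [0, 1], so the assumptions hold.
x_star = fixed_point(math.cos)
```

If f has several fixed points the routine returns only one of them, and if f is discontinuous the invariant no longer guarantees that the returned point is a fixed point at all.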


The cases involving x = z or y = w do not violate the claim that the pair of
best-response correspondences has at least one intersection. On the contrary, in
addition to the qualitative features described in the text, there can now be two
pure-strategy Nash equilibria without a mixed-strategy Nash equilibrium, and a
continuum of mixed-strategy Nash equilibria.

Applying a fixed-point theorem to prove Nash's Theorem in-
volves two steps: (1) showing that any fixed point of a certain
correspondence is a Nash equilibrium; (2) using an appropriate
fixed-point theorem to show that this correspondence must have
a fixed point. The relevant correspondence is the n-player best-
response correspondence. The relevant fixed-point theorem is due
to Kakutani (1941), who generalized Brouwer's theorem to allow
for (well-behaved) correspondences as well as functions.

The n-player best-response correspondence is computed from
the n individual players' best-response correspondences as fol-
lows. Consider an arbitrary combination of mixed strategies
(p_1, ..., p_n). For each player i, derive i's best response(s) to the
other players' mixed strategies (p_1, ..., p_{i-1}, p_{i+1}, ..., p_n). Then con-
struct the set of all possible combinations of one such best response
for each player. (Formally, derive each player's best-response
correspondence and then construct the cross-product of these n
individual correspondences.) A combination of mixed strategies
(p_1*, ..., p_n*) is a fixed point of this correspondence if (p_1*, ..., p_n*)
belongs to the set of all possible combinations of the players' best
responses to (p_1*, ..., p_n*). That is, for each i, p_i* must be (one of)
player i's best response(s) to (p_1*, ..., p_{i-1}*, p_{i+1}*, ..., p_n*), but this
is precisely the statement that (p_1*, ..., p_n*) is a Nash equilibrium.
This completes step (1).

Step (2) involves the fact that each player's best-response cor- 
respondence is continuous, in an appropriate sense. The role of 
continuity in Brouwer’s fixed-point theorem can be seen by mod- 
ifying f(x) in Figure 1.3.13: if f(x) is discontinuous then it need 
not have a fixed point. In Figure 1.3.14, for example, f(x) > x for
all x < x', but f(x) < x for all x ≥ x'.

To illustrate the differences between f(x) in Figure 1.3.14 and a
player's best-response correspondence, consider Case (iii) in Fig-
ure 1.3.10: at q = q', r*(q') includes zero, one, and the entire
interval in between. (A bit more formally, r*(q') includes the limit
of r*(q) as q approaches q' from the left, the limit of r*(q) as q
approaches q' from the right, and all the values of r in between
these two limits.) If f(x') in Figure 1.3.14 behaved analogously to
player 1's best-response correspondence r*(q'), then f(x') would
include not only the solid circle (as in the figure) but also the
open circle and the entire interval in between, in which case f(x)
would have a fixed point at x'.

[Figure 1.3.14: a discontinuous function f(x) on [0, 1] with no
fixed point.]

The value of f(x') is indicated by the solid circle. The open circle indicates
that f(x') does not include this value. The dotted line is included only to indicate
that both circles occur at x = x'; it does not indicate further values of f(x').

Each player's best-response correspondence always behaves
the way r*(q') does in Figure 1.3.10: it always includes (the appro-
priate generalizations of) the limit from the left, the limit from the
right, and all the values in between. The reason for this is that,
as shown earlier for the two-player case, if player i has several
pure strategies that are best responses to the other players' mixed
strategies, then any mixed strategy p_i that puts all its probability
on some or all of player i's pure-strategy best responses (and zero
probability on all of player i's other pure strategies) is also a best
response for player i. Because each player's best-response corre-
spondence always behaves in this way, the n-player best-response
correspondence does too; these properties satisfy the hypotheses
of Kakutani's Theorem, so the latter correspondence has a fixed
point.

Nash's Theorem guarantees that an equilibrium exists in a
broad class of games, but none of the applications analyzed in
Section 1.2 are members of this class (because each application
has infinite strategy spaces). This shows that the hypotheses of
Nash's Theorem are sufficient but not necessary conditions for an
equilibrium to exist: there are many games that do not satisfy
the hypotheses of the Theorem but nonetheless have one or more
Nash equilibria.


1.4 Further Reading 


On the assumptions underlying iterated elimination of strictly 
dominated strategies and Nash equilibrium, and on the inter- 
pretation of mixed strategies in terms of the players’ beliefs, see 
Brandenburger (1992). On the relation between (Cournot-type) 
models where firms choose quantities and (Bertrand-type) mod- 
els where firms choose prices, see Kreps and Scheinkman (1983), 
who show that in some circumstances the Cournot outcome occurs 
in a Bertrand-type model in which firms face capacity constraints 
(which they choose, at a cost, prior to choosing prices). On arbitra- 
tion, see Gibbons (1988), who shows how the arbitrator’s preferred 
settlement can depend on the information content of the parties’ 
offers, in both final-offer and conventional arbitration. Finally, on
the existence of Nash equilibrium, including pure-strategy equi- 
libria in games with continuous strategy spaces, see Dasgupta and 
Maskin (1986). 


1.5 Problems 


Section 1.1 


1.1. What is a game in normal form? What is a strictly dominated
strategy in a normal-form game? What is a pure-strategy Nash
equilibrium in a normal-form game?


1.2. In the following normal-form game, what strategies survive
iterated elimination of strictly dominated strategies? What are the
pure-strategy Nash equilibria?


1.3. Players 1 and 2 are bargaining over how to split one dollar.
Both players simultaneously name shares they would like to have,
s_1 and s_2, where 0 ≤ s_1, s_2 ≤ 1. If s_1 + s_2 ≤ 1, then the players
receive the shares they named; if s_1 + s_2 > 1, then both players
receive zero. What are the pure-strategy Nash equilibria of this
game?


Section 1.2 


1.4. Suppose there are n firms in the Cournot oligopoly model.
Let q_i denote the quantity produced by firm i, and let Q = q_1 + ... +
q_n denote the aggregate quantity on the market. Let P denote the
market-clearing price and assume that inverse demand is given
by P(Q) = a - Q (assuming Q < a, else P = 0). Assume that the
total cost of firm i from producing quantity q_i is C_i(q_i) = c q_i. That
is, there are no fixed costs and the marginal cost is constant at c,
where we assume c < a. Following Cournot, suppose that the
firms choose their quantities simultaneously. What is the Nash
equilibrium? What happens as n approaches infinity?


1.5. Consider the following two finite versions of the Cournot
duopoly model. First, suppose each firm must choose either half
the monopoly quantity, q_m/2 = (a - c)/4, or the Cournot equilib-
rium quantity, q_c = (a - c)/3. No other quantities are feasible.
Show that this two-action game is equivalent to the Prisoners'
Dilemma: each firm has a strictly dominated strategy, and both
are worse off in equilibrium than they would be if they cooper-
ated. Second, suppose each firm can choose either q_m/2, or q_c,
or a third quantity, q'. Find a value for q' such that the game is
equivalent to the Cournot model in Section 1.2.A, in the sense that
(q_c, q_c) is a unique Nash equilibrium and both firms are worse off
in equilibrium than they could be if they cooperated, but neither
firm has a strictly dominated strategy.


1.6. Consider the Cournot duopoly model where inverse demand
is P(Q) = a - Q but firms have asymmetric marginal costs: c_1
for firm 1 and c_2 for firm 2. What is the Nash equilibrium if
0 < c_i < a/2 for each firm? What if c_1 < c_2 < a but 2c_2 > a + c_1?


1.7. In Section 1.2.B, we analyzed the Bertrand duopoly model
with differentiated products. The case of homogeneous products
yields a stark conclusion. Suppose that the quantity that con-
sumers demand from firm i is a - p_i when p_i < p_j, 0 when p_i > p_j,
and (a - p_i)/2 when p_i = p_j. Suppose also that there are no fixed
costs and that marginal costs are constant at c, where c < a. Show
that if the firms choose prices simultaneously, then the unique
Nash equilibrium is that both firms charge the price c.


1.8. Consider a population of voters uniformly distributed along
the ideological spectrum from left (x = 0) to right (x = 1). Each of
the candidates for a single office simultaneously chooses a cam-
paign platform (i.e., a point on the line between x = 0 and x = 1).
The voters observe the candidates' choices, and then each voter
votes for the candidate whose platform is closest to the voter's
position on the spectrum. If there are two candidates and they
choose platforms x_1 = .3 and x_2 = .6, for example, then all
voters to the left of x = .45 vote for candidate 1, all those to
the right vote for candidate 2, and candidate 2 wins the elec-
tion with 55 percent of the vote. Suppose that the candidates
care only about being elected: they do not really care about their
platforms at all! If there are two candidates, what is the pure-
strategy Nash equilibrium? If there are three candidates, exhibit
a pure-strategy Nash equilibrium. (Assume that any candidates
who choose the same platform equally split the votes cast for that
platform, and that ties among the leading vote-getters are resolved
by coin flips.) See Hotelling (1929) for an early model along these
lines.


Section 1.3 


1.9. What is a mixed strategy in a normal-form game? What is a 
mixed-strategy Nash equilibrium in a normal-form game? 


1.10. Show that there are no mixed-strategy Nash equilibria in 
the three normal-form games analyzed in Section 1.1—the Prison- 
ers’ Dilemma, Figure 1.1.1, and Figure 1.1.4. 


1.11. Solve for the mixed-strategy Nash equilibria in the game in
Problem 1.2.


1.12. Find the mixed-strategy Nash equilibrium of the following
normal-form game.


1.13. Each of two firms has one job opening. Suppose that (for
reasons not discussed here but relating to the value of filling each
opening) the firms offer different wages: firm i offers the wage w_i,
where (1/2)w_1 < w_2 < 2w_1. Imagine that there are two workers,
each of whom can apply to only one firm. The workers simulta-
neously decide whether to apply to firm 1 or to firm 2. If only one
worker applies to a given firm, that worker gets the job; if both
workers apply to one firm, the firm hires one worker at random
and the other worker is unemployed (which has a payoff of zero).
Solve for the Nash equilibria of the workers' normal-form game.
(For more on the wages the firms will choose, see Montgomery
[1991].)

[Payoff matrix: rows are Worker 1's choices (Apply to Firm 1,
Apply to Firm 2); columns are Worker 2's choices (Apply to
Firm 1, Apply to Firm 2).]

1.14. Show that Proposition B in Appendix 1.1.C holds for mixed-
as well as pure-strategy Nash equilibria: the strategies played with
positive probability in a mixed-strategy Nash equilibrium survive
the process of iterated elimination of strictly dominated strategies.


11.1 Introduction


Because auctions are stylized markets with well-defined rules, modelling them with 
game theory is particularly appropriate. Moreover, several of the motivations be- 
hind auctions are similar to the motivations behind the asymmetric information 
contracts of Part II of this book. Besides the mundane reasons such as speed of 
sale that make auctions important, auctions are useful for a variety of informational 
purposes. Often the buyers know more than the seller about the value of what is 
being sold, and the seller, not wanting to suggest a price first, uses an auction as a 
way to extract information. Art auctions are a good example, because the value of 
a painting depends on the buyer’s tastes, which are known only to himself. 

Auctions are also useful for agency reasons, because they hinder dishonest dealing 
between the seller’s agent and the buyer. If the mayor were free to offer a price for 
building the new city hall and accept the first contractor who showed up, the lucky 
contractor would probably be the one who made the biggest political contribution. 
If the contract is put up for auction, cheating the public is more costly, and the 
difficulty of rigging the bids may outweigh the political gain. 

We will spend most of this chapter on the effectiveness of different kinds of auc- 
tion rules in extracting surplus from buyers, which requires considering the strate- 
gies with which they respond to the rules. Section 11.2 classifies auctions based on 
the relationships between different buyers’ valuations of what is being auctioned. 
Section 11.3 classifies them based on the rules of the auction, and explains the 
optimal strategies under the private-value information structure. Section 11.4 dis- 
cusses optimal strategies under common-value information, which can lead bidders 
into the “winner’s curse” if they are not careful. Section 11.5 is about information 
asymmetry in common-value auctions. 


11.2 Auction Classification and Private-Value Strategies 

Private, Common, and Correlated Values 

Auctions differ enough for an intricate classification to be useful. One way to
classify auctions is based on differences in the values buyers put on what is being
auctioned. We will call the dollar value of the utility that player i receives from an
object its value to him, V_i, and we will call his estimate of its value his valuation,
V̂_i.





In a private-value auction, each player knows his value with certainty, although 
he may still have to estimate the values of the other players. An example is the 
sale of antique chairs to people who will not resell them. The values need not be 
independent. If it were common knowledge that the values were either all high or 
all low, for example, that could affect the choice of auction rules by the seller and 
the estimates made by buyers of each others’ values. A player's value equals his 
valuation in a private-value auction. 

If an auction is to be private-value, it cannot be followed by costless resale of 
the object. If there were resale, a bidder's valuation would depend on the price at 
which he could resell, which would depend on the other players' valuations. 

What is special about a private-value auction is that a player cannot extract any 
information about his own value from the valuations of the other players. Knowing 
all the other bids in advance would not change his valuation, although it might well 
change his strategy. The outcomes would be similar even if he had to estimate his 
own value, so long as the behavior of other players did not help him to estimate it, 
so this kind of auction could just as well be called “private-valuation.” 

In a common-value auction, the players have identical values, but each player
forms his own valuation by estimating with his private information. An example is 
bidding for US Treasury bills. A player's valuation in that auction would change if 
he could sneak a look at the other players' valuations, because they are all trying 
to estimate the same true value. 

The correlated-value auction is a general category which includes the common- 
value auction as an extreme case. In this auction, the valuations of the different 
players are correlated, but their values may differ. Practically every auction we 
see is correlated-value, but, as always in modelling, we must trade off descriptive 
accuracy against simplicity, and private-value vs. common-value is an appropriate 
simplification. 


Auction Rules and Private-Value Strategies 


Auctions have as many different sets of rules as poker games do. Cassady's 1967 
book describes a myriad of rules, but here I will just list the main varieties and de- 
scribe the equilibrium private-value strategies. In teaching this material, I ask each 
student to pick a valuation between 80 and 100, after which we conduct auctions 
of the various kinds. I advise the reader to try this. Pick two valuations and try 
out sample strategy combinations for the different auctions as they are described. 
Even though the values are private, it will immediately become clear that the best- 
response bids still depend on the strategies the bidder thinks other players have 
adopted. 
The types of auctions to be described are: 


(1) English. 
(2) First-price sealed-bid.
(3) Second-price sealed-bid.
(4) Dutch. 


(1) English (first-price open-cry) 


Rules. Each bidder is free to revise his bid upwards. When no bidder wishes to 
revise his bid further, the highest bidder wins the object and pays his bid. 


Strategies. A player’s strategy is his series of bids as a function of (1) his value, 
(2) his prior estimate of other players’ valuations, and (3) the past bids of all the 
players. His bid can therefore be updated as his information set changes. 


Payoffs. The winner’s payoff is his value minus his highest bid. 


A player's dominant strategy in a private-value English auction is to keep bidding 
some small amount epsilon more than the previous high bid until he reaches his 
valuation, and then to stop. This is optimal because he always wants to buy the 
object if the price is less than its value to him, but he wants to pay the lowest 
price possible. All bidding ends when the price reaches the valuation of the player 
with the second-highest valuation. The optimal strategy is independent of risk 
neutrality if players know their own values with certainty rather than estimating 
them, although risk-averse players who must estimate their values should be more 
conservative in bidding. 
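
The bidding process just described can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the rules above: the function name `english_auction`, the fixed increment `epsilon`, the opening price of zero, and the order in which bidders are polled are all hypothetical choices.

```python
def english_auction(values, epsilon=1.0):
    """Open-cry auction in which each bidder tops the standing bid by
    epsilon whenever the new bid would not exceed his own value."""
    price = 0.0      # standing high bid (opening price of zero is an assumption)
    leader = None    # index of the bidder who holds the standing bid
    while True:
        raised = False
        for i, value in enumerate(values):
            # The current leader has no reason to outbid himself.
            if i != leader and price + epsilon <= value:
                price += epsilon
                leader = i
                raised = True
        if not raised:           # nobody wants to go higher: the auction ends
            return leader, price


winner, price = english_auction([100, 80, 95], epsilon=1.0)
print(winner, price)
```

With values of 100, 80, and 95 and an increment of 1, the bidder who values the object at 100 wins at a price within one increment of 95, the second-highest value.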

In correlated-value open-cry auctions, the bidding procedure is important. The 
most common procedures are (1) for the auctioneer to raise prices at a constant 
rate, (2) for him to raise prices at whatever rate he thinks appropriate, and (3) 
for the bidders to raise prices as I specified in the rules above. A fourth procedure 
is often the easiest to model: the open exit auction, in which the price rises 
continuously and players must publicly announce that they are dropping out (and 
cannot reenter) when the price becomes unacceptably high. In an open exit auction 
the players have more evidence available about each others' valuations than if they 
could drop out secretly. 


(2) First-price sealed-bid 


Rules. Each bidder submits one bid, in ignorance of the other bids. The highest 
bidder pays his bid and wins the object. 


Strategies. A player's strategy is his bid as a function of his value and his prior 
beliefs about other players' valuations. 


Payoffs. The winner’s payoff is his value minus his bid.


Suppose Smith's value is 100. If he bid 100 and won when the second bid was
80, he would wish that he had bid only 80 + ε. If it is common knowledge that the
second-highest value is 80, Smith's bid should be 80 + ε. If he is not sure about
the second-highest value, the problem is difficult and no general solution has been
discovered. The tradeoff is between bidding high—thus winning more often—and 
bidding low—thus benefiting more if the bid wins. The optimal strategy, whatever 
it may be, depends on risk neutrality and beliefs about the other bidders, so the 
equilibrium is less robust than the equilibria of English and second-price auctions. 

Nash equilibria can be found for more specific first-price auctions. Suppose that 
there are N risk-neutral bidders, and that Nature assigns them values independently 
using a uniform density from 0 to some amount $\bar{v}$. Denote player i's value by $v_i$,
and let us consider the strategy for player 1. If some other player has a higher value,
then in a symmetric equilibrium player 1 is going to lose the auction anyway, so we
can ignore that possibility in finding his optimal bid. Player 1's equilibrium strategy
is to bid epsilon above his expectation of the second-highest value, conditional on
his bid being the highest (i.e., assuming that no other bidder has a value over $v_1$).

If we assume that $v_1$ is the highest value, the probability density that player 2's
value, which is uniformly distributed between 0 and $v_1$, equals $v$ is $1/v_1$, and the
probability that $v_2$ is less than or equal to $v$ is $v/v_1$. The probability that $v_2$ lies
in $[v, v + dv]$ and is the second-highest value is


$$\mathrm{Prob}(v_2 = v)\,\mathrm{Prob}(v_3 \leq v)\,\mathrm{Prob}(v_4 \leq v) \cdots \mathrm{Prob}(v_N \leq v)\,dv, \tag{11.1}$$


which equals


$$\frac{1}{v_1}\left(\frac{v}{v_1}\right)^{N-2} dv. \tag{11.2}$$


Since there are N − 1 players besides player 1, the probability that one of them
has the value $v$, and $v$ is the second-highest is N − 1 times expression (11.2). The
expectation of $v$ is the integral of $v$ over the range 0 to $v_1$,


$$\begin{aligned}
E(v) &= \int_0^{v_1} v\,(N-1)\,\frac{1}{v_1}\left[\frac{v}{v_1}\right]^{N-2} dv \\
     &= (N-1)\,\frac{1}{v_1^{N-1}} \int_0^{v_1} v^{N-1}\,dv \\
     &= \frac{(N-1)\,v_1}{N}.
\end{aligned} \tag{11.3}$$


Thus we find that player 1 ought to bid a fraction $\frac{N-1}{N}$ of his own value, plus
epsilon.
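
The fraction (N − 1)/N can be checked numerically. The following is a minimal sketch, assuming (as in the example) that the rivals' values are independent and uniform on the interval from 0 to player 1's value once we condition on player 1 having the highest value. The function names, the number of draws, and the seed are illustrative choices only.

```python
import random


def expected_second_highest(v1, n_bidders, draws=100_000, seed=0):
    """Monte Carlo estimate of the highest rival value, given that all
    n_bidders - 1 rivals have values uniform on [0, v1]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        total += max(rng.uniform(0.0, v1) for _ in range(n_bidders - 1))
    return total / draws


def equilibrium_bid(v1, n_bidders):
    """Closed-form bid (N - 1)/N times the bidder's own value."""
    return (n_bidders - 1) / n_bidders * v1


print(expected_second_highest(90.0, 4), equilibrium_bid(90.0, 4))
```

The two printed numbers should be close to each other; the simulated figure carries sampling error, so only approximate agreement is to be expected.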

The previous example is an elegant result, but it is not a general rule. Suppose 
Smith knows that Brown's value is 0 or 100 with equal probability, and Smith's 
value of 400 is known by both players. Brown bids either 0 or 100 in equilibrium, 
and Smith always bids (100 + ε), because his value is so high that winning is more
important than paying a low price. 

If Smith's value were 102 instead of 400, the equilibrium would be much different.
Smith would use a mixed strategy, and while Brown would still offer 0 if his value
were 0, if his value were 100 he would use a mixed strategy too. No pure strategy
could be part of a Nash equilibrium, because if Smith always bid a value x < 100,
Brown would always bid x + ε, in which case Smith would deviate to x + 2ε, and
if Smith bid x ≥ 100 he would be paying 100 more than necessary half the time.
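
A rough payoff comparison shows why the two cases differ. As a simplifying assumption made only for this illustration, hold Brown's bids fixed at 0 (when his value is 0) and 100 (when his value is 100), suppose Smith wins only by bidding strictly more than Brown, and write $\pi(b)$ for Smith's expected payoff from bid $b$:

```latex
\begin{align*}
\text{Value 400:}\quad & \pi(100+\varepsilon) = 400 - (100+\varepsilon) = 300 - \varepsilon, \\
                       & \pi(\varepsilon) = \tfrac{1}{2}\,(400 - \varepsilon) \approx 200. \\[4pt]
\text{Value 102:}\quad & \pi(100+\varepsilon) = 102 - (100+\varepsilon) = 2 - \varepsilon, \\
                       & \pi(\varepsilon) = \tfrac{1}{2}\,(102 - \varepsilon) \approx 51.
\end{align*}
```

With a value of 400 the sure win is worth more than the cheap gamble, so Smith bids 100 + ε. With a value of 102 the ranking reverses: Smith is tempted to bid low, which in turn gives Brown a reason to bid less than 100, and the pure strategies unravel.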


(3) Second-price sealed-bid 


Rules. Each bidder submits one bid, in ignorance of the other bids. The bids 
are opened, and the highest bidder pays the amount of the second-highest bid and 
wins the object. 


Strategies. A player’s strategy is his bid as a function of his value and his prior 
belief about other players’ valuations. 


Payoffs. The winner’s payoff is his value minus the second-highest bid that was 
made. 


Second-price auctions are similar to English auctions. I have never heard of 
them actually being carried out, but they are useful for modelling. Bidding one’s 
valuation is the dominant strategy: a player who bids less is more likely to lose 
the auction, but pays the same price if he does win. The structure of the payoffs 
is reminiscent of the Groves mechanism of Section 7.7, because in both games a 
player’s strategy affects some major event (who wins the auction or whether the 
project is undertaken), but his strategy affects his own payoff only via that event. 
In the auction’s equilibrium, each player bids his value and the winner ends up 
paying the second-highest value. If players know their own values, the outcome 
does not depend on risk neutrality. 
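
The dominance argument can be verified by brute force on a grid of bids. This is a sketch under stated assumptions: only the highest rival bid matters, ties are resolved against the bidder in question, and the names `second_price_payoff` and `truthful_is_weakly_dominant` are hypothetical.

```python
def second_price_payoff(my_bid, my_value, rival_bids):
    """Payoff in a second-price sealed-bid auction.
    Ties go to the rival (an assumption made for simplicity)."""
    top_rival = max(rival_bids)
    if my_bid > top_rival:
        return my_value - top_rival   # winner pays the second-highest bid
    return 0


def truthful_is_weakly_dominant(my_value, grid):
    """Check that no alternative bid on the grid ever beats bidding
    my_value, whatever the highest rival bid on the grid turns out to be."""
    for top_rival in grid:
        truthful = second_price_payoff(my_value, my_value, [top_rival])
        for alternative in grid:
            if second_price_payoff(alternative, my_value, [top_rival]) > truthful:
                return False
    return True


print(truthful_is_weakly_dominant(90, range(80, 101)))
```

Bidding above one's value can only add wins at prices above the value, and bidding below it can only forgo profitable wins; the price paid in the remaining cases is unchanged.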


(4) Dutch (Descending) 


Rules. The seller announces a bid, which he continuously lowers until some buyer 
stops him and takes the object at that price. 


Strategies. A player’s strategy is when to stop the bidding as a function of his 
valuation and his prior beliefs as to other players’ valuations. 


Payoffs. The winner’s payoff is his value minus his bid. 


The Dutch auction is strategically equivalent to the first-price sealed-bid 
auction, which means that there is a one-to-one mapping between the strategy sets 
and the equilibria of the two games. The reason for the strategic equivalence is 
that no relevant information is disclosed in the course of the auction, only at the 
end, when it is too late to change anybody’s behavior. In the first-price auction a 
player’s bid is irrelevant unless it is the highest, and in the Dutch auction a player’s 
stopping price is also irrelevant unless it is the highest. The equilibrium price is 
calculated the same way for both auctions. 
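
The one-to-one mapping can be illustrated by treating each bidder's planned stopping price in the Dutch auction as his sealed bid in the first-price auction. In this sketch the function names, the integer price grid, and the tie-breaking rule (the lowest-indexed bidder wins a tie) are assumptions for illustration.

```python
def first_price_sealed(bids):
    """Highest bid wins and is paid; ties go to the lowest index."""
    winner = max(range(len(bids)), key=lambda i: bids[i])
    return winner, bids[winner]


def dutch(stop_prices, start, step):
    """The clock falls from `start` in steps of `step`; the first bidder
    whose planned stopping price is reached takes the object."""
    price = start
    while price > 0:
        for i, stop in enumerate(stop_prices):
            if stop >= price:        # bidder i presses his button
                return i, price
        price -= step
    return None, 0                   # nobody stopped the clock


plans = [70, 85, 60]
print(first_price_sealed(plans), dutch(plans, start=100, step=1))
```

Because nobody learns anything before the clock is stopped, a plan of the form "stop at 85" carries exactly the same information as a sealed bid of 85, and the two functions return the same winner and price whenever the stopping prices lie on the clock's grid.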

Dutch auctions are actually used. One example is the Ontario tobacco auction, 
which uses a clock four feet in diameter marked with quarter cent gradations. Each
of six or so buyers has a stop button. The clock hand drops a quarter cent at a
time, and the stop buttons are registered so that ties cannot occur (tobacco buyers
need reflexes like race-car drivers). The farmers who are selling their tobacco watch
from an adjoining room and can later reject the bids if they feel they are too low
(a form of reserve price). 2,500,000 lb. a day can be sold using the clock (Cassady
[1967] p. 200).

Dutch auctions are common in less obvious forms. Filene's is one of the biggest 
stores in Boston, and Filene's Basement is its most famous department. In the 
basement are a variety of marked-down items formerly in the regular store, each 
with a price and date attached. The price customers pay at the register is the price 
on the tag minus a discount which depends on how long ago the item was dated. 
As time passes and the item remains unsold, the discount rises from 10 to 50 to 
70 percent. The idea of predictable time discounting has also recently been used 
by bookstores ("Waldenbooks to Cut Some Book Prices in Stages in Test of New
Selling Tactic," Wall Street Journal, 29 March 1988, p. 34). 


11.3 Comparing Auction Rules

Equivalence Theorems


When one mentions auction theory to an economic theorist, the first thing that 
springs to his mind is the idea that various kinds of auctions are the same in some 
sense. Milgrom & Weber (1982) give a good summary of how and why this is 
true. Regardless of the information structure, the Dutch and first-price sealed-bid 
auctions are the same in the sense that the strategies and the payoffs associated with 
the strategies are the same. That equivalence does not depend on risk neutrality, 
but let us assume that all players are risk neutral for the next few paragraphs. 

In private independent-value auctions, the second-price sealed-bid and English 
auctions are the same in the sense that the bidder who values the object most 
highly wins and pays the valuation of the second-highest valuer, but the strategies 
are different in the two auctions. In all four kinds of private independent-value 
auctions discussed, the seller's expected price is the same. This fact is the biggest 
result in auction theory: the revenue equivalence theorem (Vickrey [1961]). 

The revenue equivalence theorem does not imply that in every realization of the
game all four auction rules yield the same price, only that the expected price is the
same. The difference arises because in the Dutch and first-price sealed-bid auctions,
the winning bidder has estimated the value of the second-highest bidder, and that
estimate, while correct on average, is above or below the true value in particular
realizations. With independent private values, however, the winning bid is a
conditional expectation that averages over the possible second-highest values, so
the price varies less in those auctions than in the English or second-price auctions,
where it tracks the realized second-highest value. A risk-averse seller should
therefore prefer the Dutch or first-price auction.
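
A small simulation makes the distinction between expected and realized prices concrete. It is a sketch only, assuming risk-neutral bidders with independent values drawn uniformly from 0 to an upper bound `v_bar`, the (N − 1)/N bidding rule from the uniform example of the previous section for the first-price and Dutch auctions, and a price equal to the second-highest value in the English and second-price auctions. The function name, the number of rounds, and the seed are arbitrary.

```python
import random
import statistics


def simulate_prices(n_bidders=4, v_bar=100.0, rounds=50_000, seed=1):
    """Return the lists of prices under the two families of auction rules."""
    rng = random.Random(seed)
    first_price, second_price = [], []
    for _ in range(rounds):
        values = sorted(rng.uniform(0.0, v_bar) for _ in range(n_bidders))
        # First-price / Dutch: the winner pays (N - 1)/N of his own value.
        first_price.append((n_bidders - 1) / n_bidders * values[-1])
        # Second-price / English: the winner pays the second-highest value.
        second_price.append(values[-2])
    return first_price, second_price


fp, sp = simulate_prices()
print("mean price:", statistics.mean(fp), statistics.mean(sp))
print("std. dev. :", statistics.pstdev(fp), statistics.pstdev(sp))
```

The two means should agree up to sampling error, while the prices in any single round generally differ; the printed standard deviations allow the spread of the price under the two families of rules to be compared directly.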

Whether the auction is private-value or not, the Dutch and first-price sealed-bid 
auctions are strategically equivalent. If the auction is correlated-value and there are 
three or more bidders, the open exit English auction leads to greater revenue than 
the second-price sealed-bid auction, and both yield greater revenue than the first- 
price sealed-bid auction (Milgrom & Weber [1982]). If there are just two bidders, 
however, the open exit English auction is no better than the second-price sealed-
bid auction, because the open exit feature (knowing when non-bidding players drop
out) is irrelevant.

A question of less practical interest is whether an auction form is Pareto-optimal;
that is, does the auctioned object end up in the hands of whoever values it most?
In a common-value auction this is not an interesting question, because all bidders
value the object equally. In a private-value auction, all of the auction forms
(first-price, second-price, Dutch, and English) are Pareto-optimal. They are also
optimal in a correlated-value auction if all players draw their information from the
same distribution and the equilibrium is in symmetric strategies.

Auctions with risk-averse bidders are difficult to analyze. One known fact is
that in a private-value auction the first-price sealed-bid auction yields a greater
expected revenue than the English or second-price auctions. That is because by
increasing his bid from the level optimal for a risk-neutral bidder, the risk-averse
bidder insures himself. If he wins, his surplus is slightly less because of the higher
price, but he is more likely to win and avoid a surplus of zero. Thus, the buyers’
risk aversion helps the seller.


Hindering Buyer Collusion


As I mentioned at the start of this chapter, one motivation for auctions is to dis-
courage collusion between players. Some auctions are more vulnerable to this than
others. Robinson (1985) has pointed out that whether the auction is private-value
or common-value, the first-price sealed-bid auction is superior to the second-price
sealed-bid or English auctions for deterring collusion among bidders.

Consider a buyers’ cartel in which buyer Smith has a private value of 20, the
other buyers’ values are each 18, and they agree that everybody will bid 5 except
Smith, who will bid 6 (we will not consider the rationality of this choice of bids,
which might be based on avoiding legal penalties). In an English auction this is
self-enforcing, because if somebody cheats and bids 7, Smith is willing to go all
the way up to 20 and the cheater will end up with no gain from his deviation.
Enforcement is also easy in a second-price sealed-bid auction, because the cartel
agreement can be that Smith bids 20 and everyone else bids 6.

In a first-price sealed-bid auction, however, it is hard to prevent buyers from
cheating on their agreement in a one-shot game. Smith does not want to bid 20,
because he would have to pay 20, but if he bids anything less than the other players’
value of 18 he risks them overbidding him. The seller will end up with a price of
18, rather than the 6 he would receive in an English auction with collusion.


11.4 Common-Value Auctions and the Winner’s Curse


In Section 11.2 we distinguished private-value auctions from common-value auc-
tions, in which the values of the players are identical but their valuations may
differ. All four sets of rules discussed in Section 11.3 can be used for common-value
auctions, but the optimal strategies are different. In common-value auctions, the
players can extract useful information about the object’s value to themselves from
the bids of the other players. Surprisingly enough, a buyer can use the information
from other buyers’ bids even in a sealed-bid auction, as will be explained below.

When I teach this material I bring a jar of pennies to class and ask the students
to bid for it in an English auction. All but two of the students get to look at the 
jar before the bidding starts, and everybody is told that the jar contains more than 
9 and less than 100 pennies. Before the bidding starts, I ask each student to write 
down his best guess of the number of pennies. The two students who do not get to 
see the jar are like "technical analysts," those peculiar people who try to forecast 
stock prices using charts showing the past movements of the stock while remaining 
ignorant of the stock's “fundamentals.” 

A common-value auction in which all the bidders knew the value would not be 
very interesting, but more commonly, as in the penny jar example, the bidders 
must estimate the common value. The obvious strategy, especially following our 
discussion of private-value auctions, is for a player to bid up to his unbiased estimate 
of the number of pennies in the jar. But in fact this strategy makes the winner's 
payoff negative, because the winner is the bidder who has made the largest positive 
error in his valuation. The bidders who underestimated the number of pennies, on 
the other hand, lose the auction, but their payoff is limited to a downside value of 
zero, which they would receive even if the true value were common knowledge. Only 
the winner suffers from overbidding: he has stumbled into the winner’s curse. 
When other players are better informed, it is even worse for an uninformed player 
to win. Anyone, for example, who wins an auction against 50 experts should worry 
about why they all bid less. 
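
The penny-jar argument can be simulated. The sketch below rests on assumptions of its own: ten bidders, estimates equal to the true value plus independent noise uniform on plus or minus 20, and an English auction in which each bidder stays in until the price reaches his estimate, so that the winner pays the second-highest estimate. The function name and all parameter values are illustrative.

```python
import random


def naive_winner_payoff(true_value=50.0, n_bidders=10, noise=20.0,
                        rounds=20_000, seed=2):
    """Average payoff of the winner when every bidder naively bids up to
    his own unbiased estimate of the common value."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rounds):
        estimates = sorted(true_value + rng.uniform(-noise, noise)
                           for _ in range(n_bidders))
        price = estimates[-2]     # bidding stops when the runner-up drops out
        total += true_value - price
    return total / rounds


print(naive_winner_payoff())
```

Each estimate is unbiased on its own, yet the printed average payoff is negative, because the price is set by the most optimistic estimates in the room.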

To avoid the winner's curse, players should scale down their estimates to form 
their bids. The mental process is a little like deciding how much to bid in a private- 
value first-price sealed-bid auction, in which bidder Smith estimates the second- 
highest value conditional upon himself having the highest value and winning. In 
the common-value auction, Smith estimates his own value, not the second-highest, 
conditional upon himself winning the auction. He knows that if he wins using his 
unbiased estimate, he probably bid too high, so after winning with such a bid he 
would like to retract it. Ideally, he would submit a bid of [X if I lose, but (X − Y)
if I win], where X is his valuation conditional upon losing and (X − Y) is his lower
valuation conditional upon winning. If he still won with a bid of (X − Y) he would
be happy; if he lost, he would be relieved. But Smith can achieve the same effect
by simply submitting the bid (X − Y) in the first place, since the size of losing bids
is irrelevant. 

Another explanation of the winner's curse can be devised from the Milgrom def-
inition of "bad news" (Milgrom [1981b], note N6.5). Suppose that the government
is auctioning off the mineral rights to a plot of land with common value $V$, and
bidder i has valuation $\hat{V}_i$. Suppose also that the bidders are identical in everything
but their valuations, which are based on the various information sets Nature has
assigned them, and that the equilibrium is symmetric, so the equilibrium bid func-
tion $b(\hat{V}_i)$ is the same for each player. If Bidder 1 wins with a bid $b(\hat{V}_1)$ that is
based on his prior valuation $\hat{V}_1$, his posterior valuation $\tilde{V}_1$ is


$$\tilde{V}_1 = E\big(V \mid \hat{V}_1,\; b(\hat{V}_2) < b(\hat{V}_1),\; \ldots,\; b(\hat{V}_n) < b(\hat{V}_1)\big). \tag{11.4}$$


The news that $b(\hat{V}_2) < \infty$ would be neither good nor bad, since it conveys no
information, but the information that $b(\hat{V}_2) < b(\hat{V}_1)$ is bad news, since it rules out
values of $b$ more likely to be produced by large values of $\hat{V}_2$. In fact, the lower the
value of $b(\hat{V}_1)$, the worse is the news of having won. Hence,


$$\tilde{V}_1 < E(V \mid \hat{V}_1) = \hat{V}_1, \tag{11.5}$$


and if Bidder 1 had bid $b(\hat{V}_1) = \hat{V}_1$ he would immediately regret having won. If his
winning bid were enough below $\hat{V}_1$, however, he would be pleased to win.

Deciding how much to scale down the bid is a hard problem because the amount
depends on how much all the other players scale down. In a second-price auction
a player calculates the value of $\tilde{V}_1$ using equation (11.4), but that equation hides
considerable complexity under the disguise of “$b(\hat{V}_2)$,” which is itself calculated as
a function of $b(\hat{V}_1)$ using an equation like (11.4).
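
The gap between the prior and posterior valuations in (11.4) and (11.5) can be estimated by simulation in a simple parametric case. Everything specific here is an assumption for illustration: the common value is uniform on 0 to 100, each valuation equals the common value plus independent noise uniform on plus or minus 10, there are five bidders, and the symmetric bid function is increasing, so that winning is the same event as having the highest valuation. The function name and the bin width used to condition on Bidder 1's valuation are likewise hypothetical.

```python
import random


def prior_and_posterior(signal=50.0, width=2.0, n_bidders=5, noise=10.0,
                        draws=200_000, seed=3):
    """Estimate E(V | own valuation near `signal`) and the same expectation
    conditional, in addition, on every rival valuation being lower."""
    rng = random.Random(seed)
    seen, seen_and_won = [], []
    for _ in range(draws):
        v = rng.uniform(0.0, 100.0)              # common value V
        own = v + rng.uniform(-noise, noise)     # Bidder 1's valuation
        if abs(own - signal) > width:
            continue                             # keep draws near the signal
        seen.append(v)
        rivals = [v + rng.uniform(-noise, noise) for _ in range(n_bidders - 1)]
        if max(rivals) < own:                    # Bidder 1 would win
            seen_and_won.append(v)
    prior = sum(seen) / len(seen)
    posterior = sum(seen_and_won) / len(seen_and_won)
    return prior, posterior


prior, posterior = prior_and_posterior()
print(prior, posterior)
```

The first number is close to the valuation itself, while the second, which conditions on having won, is noticeably lower. The difference between them plays the role of the scaling-down amount Y in the discussion above.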


Oil Tracts and the Winner's Curse 


The best known example of the winner's curse is from bidding for offshore oil tracts. 
Offshore drilling can be unprofitable even if oil is discovered, because something 
must be paid the government for the mineral rights. Capen, Clapp, & Campbell 
(1971) suggest that bidders' ignorance of the winner's curse caused overbidding in 
US government auctions of the 1960s. If the oil companies bid close to what their 
engineers estimated the tracts were worth, rather than scaling down their bids, the 
winning companies would lose on their investments. The hundredfold difference 
in the sizes of the bids in the sealed-bid auctions shown in Table 11.1 lends some 
plausibility to the view that this happened. 


Table 11.1 Bids by serious competitors in oil auctions


Santa Barbara Channel     Offshore Texas     Alaska North Slope
1968                      1968               1969
Tract 375                 Tract 506

[Bid amounts are not legible in this copy.]


Note: All bids are in millions of dollars
Source: Capen, Clapp, & Campbell (1971)

Later studies such as Mead et al. (1984) that actually looked at profitability
conclude that the rates of return from offshore drilling were not abnormally low, 
so perhaps the oil companies did scale down their bids rationally. The spread in 
bids is surprisingly wide, but that does not mean that the bidders did not properly 
scale down their estimates. Although expected profits are zero under optimal bid- 
ding, realized profits could be either positive or negative. With some probability, 
one bidder makes a large overestimate which results in too high a bid even after 
rationally adjusting for the winner’s curse. The knowledge of how to bid optimally 
does not eliminate bad luck; it only mitigates its effects. 

Another consideration is irrationality of the other bidders. If bidder Allied has 
figured out the winner’s curse, but bidders Brydox and Central have not, what 
should Allied do? Its rivals will overbid, which affects Allied’s best response. Allied 
should scale down its bid even further, because the winner’s curse is intensified 
against overoptimistic rivals. If Allied wins against a rival who standardly overbids, 
Allied has very likely overestimated the value. 

Risk aversion affects bidding in a surprisingly similar way. If all the players were 
equally risk averse, the bids would be lower, because the asset is a gamble, whose 
value is lower for the risk averse. If Smith is more risk averse than Brown, then 
Smith should be more cautious for two reasons: the direct reason that the gamble 
is worth less to Smith, and the indirect reason that when Smith wins against a 
rival like Brown who regularly bids more, Smith probably overestimated the value. 
Parallel reasoning holds if the players are risk neutral, but the private value of the 
object differs among them. 

Asymmetric equilibria can even arise when the players are identical. Second- 
price two-person common-value auctions usually have many asymmetric equilibria 
besides the symmetric equilibrium we have been discussing (see Milgrom [1981c] 
and Bikhchandani [1988]). Suppose that Smith and Brown have identical payoff 
functions, but Smith thinks Brown is going to bid aggressively. The winner’s curse 
is intensified for Smith, who would probably have overestimated if he won against 
an aggressive bidder like Brown, so Smith bids more cautiously. But if Smith 
bids cautiously, Brown is safe in bidding aggressively, and there is an asymmetric 
equilibrium. For this reason, acquiring a reputation for aggressiveness is valuable. 

Oddly enough, if there are three or more players the sealed-bid second-price 
common-value auction has a unique equilibrium, which is symmetric. The open exit 
auction is different: it has asymmetric equilibria, because after one bidder drops 
out, the two remaining bidders know that they are alone together in a subgame 
which is a two-player auction. Regardless of the number of players, first-price 
sealed-bid auctions do not have this kind of asymmetric equilibrium. Threats in a 
first-price auction are costly because the high bidder pays his bid even if his rival 
decides to bid less in response. Thus, a bidder’s aggressiveness is not made safer
by intimidation of another bidder. 

The winner’s curse crops up in situations seemingly far removed from auctions. 
An employer must beware of hiring a worker passed over by other employers. Some- 
one renting an apartment must hope that he is not the first visitor who arrived when 
the neighboring trumpeter was asleep. A firm considering a new project must worry 
that the project has been considered and rejected by competitors. The winner’s 
curse can even be applied to political theory, where certain issues keep popping up.
Opinions are like estimates, and one interpretation of different valuations is that
everyone gets the same data, but they analyze it differently. 

On a more mundane level, as I write this in 1987, four major candidates—Bush, 
Kemp, Dole, and Other—are running for the Republican nomination for President 
of the United States. Consider an entrepreneur auctioning off four certificates, each 
paying one dollar if its particular candidate wins the nomination. If every bidder is 
rational, the entrepreneur should receive a maximum of one dollar in total revenue 
from these four auctions, and less if bidders are risk averse. But holding the auction 
in a bar full of partisans, how much do you think he would actually receive? 


11.5 Information in Common-Value Auctions

The Seller’s Information


Milgrom & Weber (1982) have found that honesty is the best policy as far as the 
seller is concerned. If it is common knowledge that he has private information, 
he should release it before the auction. The reason is not that the bidders are 
risk averse (though perhaps this strengthens the result), but the “No News is Bad 
News” result of Section 7.1. If the seller refuses to disclose something, buyers know 
that the information must be unfavorable, and an unravelling argument tells us 
that the quality must be the very worst possible. 

Quite apart from unravelling, another reason to disclose information is to mit- 
igate the winner’s curse, even if the information just reduces uncertainty over the 
value without changing its expectation. In trying to avoid the winner’s curse, bid- 
ders lower their bids, so anything which makes it less of a danger raises their bids. 


Asymmetric Information Among the Buyers 


If bidder Smith knows he has uniformly worse information than bidder Brown (that 
is, his information partition is coarser than Brown’s), then he should stay out of 
the auction: his expected payoff is negative if Brown expects zero profits. 

If Smith’s information is not uniformly worse, then he can still benefit by entering 
the auction. Having independent information, in fact, is more valuable than having 
good information. Consider a common-value first-price sealed-bid auction with 
four bidders. Bidders Smith and Black have the same good information, Brown 
has that same information plus an extra signal, and Jones usually has only a poor 
estimate, but one different from any other bidder’s. Smith and Black should drop 
out of the auction—they can never beat Brown without overpaying. But Jones will 
sometimes win, and his expected surplus is positive. If, for example, real estate 
tracts are being sold, and Jones is quite ignorant of land values, he can still do 
well if on rare occasions he has inside information concerning the location of a new 
freeway, even though ordinarily he should refrain from bidding. If Smith and Black 
both use the same appraisal formula, they will compete each other’s profits away, 
and if Brown uses the formula plus extra private information, he drives their profits 
negative by taking some of the best deals from them and leaving the worst ones. 

In general, a bidder should bid less if there are more bidders or his information is 
absolutely worse (that is, his information partition is coarser). He should also bid
less if parts of his information partition are coarser than those of his rivals, even if
his information is not uniformly worse. These considerations are most important in 
sealed-bid auctions, because in an open-cry auction, information is revealed while 
other bidders still have time to act on it.


Problem 11
An Auction with Stupid Bidders


Smith's value for an object has a private component equal to 1 and also a component
common with Jones and Brown. Jones's and Brown's private components equal
zero. Each player estimates the common component Z independently, and player
i's estimate is either $x_i$ above the true value or $x_i$ below, with equal probability.
Jones and Brown are naive and always bid their valuations. The auction is English.


(1) If $x_{Smith} = 0$, what is Smith's dominant strategy if his estimate of Z equals
20?

(2) If $x_i = 8$ for all players and Smith estimates Z = 20, what are the probabilities
that he puts on different values of Z?

(3) If Smith knows that Z = 12 with certainty, what are the probabilities he puts
on the different combinations of bids by Jones and Brown?

(4) Why is 8.72 a better upper limit on bids for Smith than 21, if his estimate of
Z is 20, and $x_i = 8$ for all three players? (compute the payoffs from the two
strategies)


Notes 
N11.1 Introduction 


• McAfee & McMillan (1987) and Milgrom (1987) are excellent surveys of the literature
and theory of auctions. Both articles take some pains to relate the material to models of 
asymmetric information. Milgrom & Weber (1982) is a classic article that covers many 
aspects of auctions. 

• Auctions look like tournaments in that the winner is the player who chooses the largest
amount for some costly variable, but in auctions the losers generally do not incur costs
proportional to their bids. Shubik (1971), however, has suggested an auction in which
both the first and the second-highest bidders pay the second price. If both players begin
with infinite wealth, the game illustrates why equilibrium might not exist if strategy
sets are unbounded. Once one bidder has started bidding against another, both of them
do best by continuing to bid so as to win the prize as well as pay the bid. This auction
may seem absurd, but it has some similarity to patent races (see Section 13.4) and arms
races.


N11.3 Auction Classification and Private-Value Strategies 


• Cassady (1967) is an excellent source of institutional detail on auctions. The appendix 
to his book includes advertisements and sets of auction rules, and he cites numerous 
newspaper articles. See also New York Times, 26 July 1985, p. 23; New York Times, 31 
July 1985, pp. 1, 11; Wall Street Journal, 24 August 1984, pp. 1, 16; and "The Crackdown 
on Colluding Roadbuilders," Fortune, 3 October 1983. 

• One might think that a second-price open-cry auction would come to the same results 
as a first-price open-cry auction, because if the price advances by epsilon at each bid, 
the first and second bids are practically the same. But the second-price auction can be 
manipulated. If somebody initially bids $10 for something worth $80, another bidder 
could safely bid $1,000. No one else would bid more, and he would pay only the second 
price: $10. 

• In one variant of the English auction, the auctioneer announces each new price and a 
bidder can hold up a card to indicate he is willing to bid that price. This rule is practical 
to administer in large crowds and it also allows the seller to act strategically during the 
course of the auction. If, for example, the two highest valuations are 100 and 120, this 
auction could yield a price of 110, while the usual rules would only allow a price of 
100 + ε. 

• Vickrey (1961) notes that a Dutch auction could be set up as a second-price auction. 
When the first bidder presses his button, he primes a buzzer that goes off when a second 
bidder presses a button. 

• After the last bid of an open-cry art auction in France, the representative of the Louvre 
has the right to raise his hand and shout “pre-emption de l'etat," after which he takes 
the painting at the highest price bid (The Economist, 23 May 1987, p. 98). How does 
that affect the equilibrium strategies? What would happen if the Louvre could resell? 

• Share Auctions. In a share auction each buyer submits a bid for both a quantity and 
a price. The bidder with the highest price receives the quantity for which he bid at 
that price. If any of the product being auctioned remains, the bidder with the second- 
highest price takes the quantity he bid for, and so forth. The rules of a share auction can 
allow each buyer to submit several bids, often called a schedule of bids. The details of 
share auctions vary, and they can be either first-price or second-price. Models of share 
auctions are very complicated; see R. Wilson (1979). 

• Reserve prices are prices below which the seller refuses to sell. They can increase 
the seller’s revenue, and their effect is to make the auction more like a regular fixed- 
price market. For discussion, see Milgrom & Weber (1982). They are also useful when 
buyers collude, a situation of bilateral monopoly (see “At Many Auctions, Illegal Bidding 
Thrives as a Longtime Practice Among Dealers,” Wall Street Journal, 19 February 1988, 
p. 17.) 

In some real-world English auctions, the auctioneer does not announce the reserve 
price in advance, and he starts the bidding below it. This can be explained as a way 
to allow bidders to show each other that their valuations are greater than the starting 
price, even though it may turn out that they are all lower than the reserve price. 

• Concerning auctions with risk-averse players, see Maskin & Riley (1984). 
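
The allocation rule for share auctions described in the notes above can be sketched in a few lines. This is only an illustration, not the model in R. Wilson (1979): the function name share_auction, the example bids, the partial filling of the marginal bid, and the first-price payment rule are all assumptions made for the example.

```python
def share_auction(bids, supply):
    """Fill bids in descending order of price until the supply runs out.

    bids: list of (bidder, price, quantity) tuples.
    Assumption: the marginal bid is partially filled, and each winner
    pays his own bid price (a first-price variant).
    """
    allocation = []
    remaining = supply
    for bidder, price, quantity in sorted(bids, key=lambda b: b[1], reverse=True):
        if remaining <= 0:
            break
        filled = min(quantity, remaining)
        allocation.append((bidder, price, filled))
        remaining -= filled
    return allocation


# Hypothetical bids for 100 units of the product.
bids = [("A", 10.0, 40), ("B", 12.0, 30), ("C", 9.0, 50)]
result = share_auction(bids, supply=100)
for bidder, price, filled in result:
    print(bidder, price, filled)
```

A second-price variant would change only the price each winner pays, not the order in which quantities are filled.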
N11.4 Common-Value Auctions and the Winner’s Curse 


• Even if valuations are correlated, the optimal bidding strategies can still be the same as 
in private-value auctions if the values are independent. If everyone overestimates their 
values by 10 percent, then a player can still extract no information about his value by 
seeing other players' valuations. 

• "Getting carried away" may be a rational feature of a common-value auction. If a bidder 
has a high private value, and then learns from the course of the bidding that the common 
value is larger than he thought, he may well end up paying more than he had planned, 
although he would not regret it afterwards. Other explanations for why bidders seem 
to pay too much are the winner's curse and the fact that in every auction all but one or 
two of the bidders think that the winning bid is greater than the value of the object. 

• Milgrom & Weber (1982) use the concept of affiliated variables in classifying auctions. 
Roughly speaking, random variables X and Y are affiliated if a larger value of X means 
that a larger value of Y is more likely, or at least no less likely. Independent random 
variables are affiliated. 





4 Dynamic Games with Symmetric 
Information 


4.1 Introduction 


In this chapter we will make heavy use of the extensive form to study games with 
a sequence of moves. We start in Section 4.2 with a refinement of the Nash equilib- 
rium concept called perfectness that incorporates sensible implications of the order 
of moves. Perfectness is illustrated in Section 4.3 with a game of entry deterrence. 
Having established a way to deal with moves over time, we will analyze repeated 
games in Section 4.4 and use the Chainstore Paradox to show the perverse unim- 
portance of repetition for the Prisoner’s Dilemma. Section 4.5 shows how to model 
discounting: players who value future consumption less than present consumption. 
Neither discounting, probabilistic end dates, infinite repetitions, nor precommit- 
ment are satisfactory escapes from the Chainstore Paradox, and the Folk Theorem 
described in Section 4.6 explains why. Section 4.7 builds a framework for reputation 
models based on the Prisoner’s Dilemma, and Section 4.8 presents one particular 
reputation model, the Klein-Leffler model of product quality. 


4.2  Subgame Perfectness 
The Perfect Equilibrium of Follow the Leader I 


Subgame perfectness is an equilibrium concept based on the ordering of moves and 
the distinction between an equilibrium path and an equilibrium. The equilibrium 
path is the path through the game tree that is followed in equilibrium, but the 
equilibrium itself is a strategy combination, which includes the players’ responses to 
other players' deviations from the equilibrium path. These off-equilibrium responses 
are very important to decisions on the equilibrium path. A threat, for example, is a 
promise to carry out a certain action if another player deviates from his equilibrium 
actions. 
Perfectness is best introduced with an example. In Section 2.1, a flaw of Nash 
equilibrium was revealed in the game Follow the Leader I, which has three pure 
strategy Nash equilibria, only one of which is reasonable. The players are Smith 
and Brown, who choose disk sizes. Both their payoffs are greater if they choose 
the same size, and greatest if they coordinate on the size Large. Smith moves 
first, so his strategy set is {Small, Large}. Brown's strategy is more complicated, 
because it must specify an action for each information set, and Brown’s information 
set depends on what Smith chose. A typical element of Brown’s strategy set is 
(Large, Small), which specifies that he chooses Large if Smith chose Large, and 
Small if Smith chose Small. From the normal form we found the following three 
Nash equilibria. 


Equilibrium   Strategies                 Outcome 
X             {Large, (Large, Large)}    Both pick Large. 
Y             {Large, (Large, Small)}    Both pick Large. 
Z             {Small, (Small, Small)}    Both pick Small. 


Only Equilibrium Y is reasonable, because the order of the moves should matter 
to the decisions players make. The problem with the normal form, and thus with 
simple Nash equilibrium, is that it ignores who moves first. Smith moves first, and 
it seems reasonable that Brown should be allowed (in fact should be required) to 
rethink his strategy after Smith moves. 

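Figure 4.1 Follow the Leader I 

[Game tree: Smith chooses Small or Large. Brown observes Smith's choice and then 
chooses Small or Large, at Node B1 if Smith chose Small and at Node B2 if Smith 
chose Large.] 

Smith's move   Brown's move   Payoffs 
Small          Small          (1,1) 
Small          Large          (-1,-1) 
Large          Small          (-1,-1) 
Large          Large          (2,2) 

Payoffs to: (Smith, Brown) 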


Consider Brown's strategy of (Small, Small) in Equilibrium Z. If Smith deviated 
from equilibrium by choosing Large, it would be unreasonable for Brown to stick 
to the response Small. Instead, he should also choose Large. But if Smith expected 
a response of Large, then he would indeed choose Large, and Z would not be an 
equilibrium. A similar argument shows that (Large, Large) is an irrational strategy 
for Brown, and we are left with Y as the unique equilibrium. 

We say that Equilibria X and Z are Nash equilibria but not “perfect” Nash equi- 
libria. A strategy combination is a perfect equilibrium if it remains an equilibrium 
on all possible paths, both the equilibrium path and other paths which branch off 
into different “subgames.” 
A subgame is a game consisting of a node which is a singleton in every 
player’s information partition, that node’s successors, and the payoffs at the 
associated end nodes.¹ 


A strategy combination is a subgame perfect Nash equilibrium if (a) it 
is a Nash equilibrium for the entire game; and (b) its relevant action rules are 
a Nash equilibrium for every subgame. 


The extensive form of Follow the Leader I in Figure 4.1 (a reprise of Figure 2.1) 
has three subgames: (1) the entire game, (2) the subgame starting at Node B1, and 
(3) the subgame starting at Node B2. Strategy combination X is not a subgame 
perfect equilibrium, because it is only Nash in subgames (1) and (3), not in subgame 
(2). Strategy combination Z is not a subgame perfect equilibrium, because it is only 
Nash in subgames (1) and (2), not in subgame (3). But strategy combination Y is 
Nash in all three subgames. 
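
The subgame check can be carried out mechanically. The sketch below is an illustration under stated assumptions rather than part of the original analysis: it takes the payoffs of Figure 4.1 to be (1,1) when both choose Small, (2,2) when both choose Large, and (-1,-1) otherwise, writes Brown's strategy as (response to Large, response to Small) as in the text, and uses hypothetical names such as is_nash and is_subgame_perfect.

```python
from itertools import product

ACTIONS = ("Small", "Large")

# Payoffs (Smith, Brown), indexed by (Smith's action, Brown's action).
PAYOFF = {
    ("Small", "Small"): (1, 1),
    ("Small", "Large"): (-1, -1),
    ("Large", "Small"): (-1, -1),
    ("Large", "Large"): (2, 2),
}

# Brown's strategy: (response if Smith chose Large, response if Smith chose Small).
BROWN_STRATEGIES = list(product(ACTIONS, repeat=2))


def brown_action(strategy, smith_action):
    return strategy[0] if smith_action == "Large" else strategy[1]


def outcome(smith, brown):
    return PAYOFF[(smith, brown_action(brown, smith))]


def is_nash(smith, brown):
    """No player gains by changing his whole strategy unilaterally."""
    u_smith, u_brown = outcome(smith, brown)
    if any(outcome(s, brown)[0] > u_smith for s in ACTIONS):
        return False
    return not any(outcome(smith, b)[1] > u_brown for b in BROWN_STRATEGIES)


def is_subgame_perfect(smith, brown):
    """Nash in the whole game, and Brown best-responds in both subgames."""
    for s in ACTIONS:
        chosen = PAYOFF[(s, brown_action(brown, s))][1]
        if any(PAYOFF[(s, b)][1] > chosen for b in ACTIONS):
            return False
    return is_nash(smith, brown)


nash = [(s, b) for s in ACTIONS for b in BROWN_STRATEGIES if is_nash(s, b)]
perfect = [sb for sb in nash if is_subgame_perfect(*sb)]
print("Nash:", nash)
print("Subgame perfect:", perfect)
```

Under these assumptions the enumeration returns the three Nash equilibria X, Y, and Z, and only Y survives the subgame test.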

One reason why perfectness (the word "subgame" is usually left off) is a good 
concept is that out-of-equilibrium behavior is irrational in a non-perfect equilibrium. 
A second justification is that a weak Nash equilibrium is not robust to small 
changes in the game. So long as he is certain that Smith will not choose Large, 
Brown is indifferent between the never-to-be-used responses (Small if Large) and 
(Large if Large). Equilibria X, Y, and Z are all weak because of this. But if there is 
even a small probability that Smith will choose Large—perhaps by mistake—then 
Brown would prefer the response (Large if Large), and equilibria X and Z are no 
longer valid. Perfectness is a way to eliminate some of the weak Nash equilibria. We 
call the small probability of a mistake a tremble, and we will return to this trem- 
bling hand approach in Section 5.1 as one way to extend the notion of perfectness 
to games of asymmetric information. 


4.3 An Example of Perfectness: Entry Deterrence I 


We turn now to a game in which perfectness plays a role just as important as in 
Follow the Leader I, but the players are in conflict. An old question in industrial 
organization is whether an incumbent monopolist can maintain his position by 
threatening to wage a price war against any new firm that enters the market. This 
idea was heavily attacked by Chicago School economists such as McGee (1958) on 
the grounds that a price war would hurt the incumbent more than colluding with 
the entrant. Game theory can present this reasoning very cleanly. Let us consider 
a single episode of possible entry and price warfare, which nobody expects to be 
repeated. 


¹ Technically, this is a proper subgame because of the information qualifier, but no economist is so 
ill-bred as to use any other kind of subgame. 
Payoffs 


Market profits are 100 at the monopoly price and 0 at the fighting price. Entry 
costs 10. Collusion shares the profits evenly. 


The strategy sets can be discovered from the order of actions and events. They 
are {Enter, Stay Out} for the entrant, and {Collude if entry occurs, Fight if entry 
occurs} for the incumbent. The game has the two Nash equilibria indicated in 
boldface, (Enter, Collude) and (Stay Out, Fight). The equilibrium (Stay Out, 
Fight) is weak, because the incumbent would just as soon Collude given that the 
entrant is staying out. 


Table 4.1 Entry Deterrence I 

                           Incumbent 
                       Collude     Fight 
Entrant    Enter       40,50       -10,0 
           Stay Out    0,100       0,100 

Payoffs to: (Entrant, Incumbent) 


A piece of information has been lost by condensing from the extensive form, 
Figure 4.2, to the normal form, Table 4.1: the entrant gets to move first. Once 
he has chosen Enter, the incumbent’s best response is Collude. The threat to fight 
is not credible and would be employed only if the incumbent could bind himself 
to fight, in which case he never does fight, because the entrant chooses to stay 
out. The equilibrium (Stay Out, Fight) is Nash but not subgame perfect, because 
if the game is started after the entrant has already entered, the incumbent’s best 
response is Collude. This does not prove that collusion is inevitable in duopoly (we 
will analyze many other duopoly models in Chapters 12 and 13), but collusion is the 
equilibrium for Entry Deterrence I. 

Figure 4.2 Entry Deterrence I 

[Game tree: the entrant chooses Enter or Stay Out. If he enters, the incumbent 
chooses Collude or Fight.] 

Entrant's move   Incumbent's move   Payoffs 
Enter            Collude            (40,50) 
Enter            Fight              (-10,0) 
Stay Out         (no move)          (0,100) 

Payoffs to: (Entrant, Incumbent) 

The trembling hand interpretation of perfect equilibrium can be used here. So 
long as it is certain that the entrant will not enter, the incumbent is indifferent 
between Fight and Collude, but if there were even a small probability of entry— 
perhaps because of a lapse of good judgement by the entrant—the incumbent would 
prefer Collude and the Nash equilibrium would be broken. 
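
Both the Nash enumeration and the trembling hand argument can be checked numerically. The sketch below uses the payoffs of Table 4.1; the function names and the entry probability of 0.01 are illustrative assumptions, not part of the original model.

```python
# Payoffs (Entrant, Incumbent) from Table 4.1.
PAYOFF = {
    ("Enter", "Collude"): (40, 50),
    ("Enter", "Fight"): (-10, 0),
    ("Stay Out", "Collude"): (0, 100),
    ("Stay Out", "Fight"): (0, 100),
}
ENTRANT_MOVES = ("Enter", "Stay Out")
INCUMBENT_MOVES = ("Collude", "Fight")


def is_nash(entry, reply):
    """Neither player can gain by a unilateral change of strategy."""
    u_entrant, u_incumbent = PAYOFF[(entry, reply)]
    return (all(PAYOFF[(e, reply)][0] <= u_entrant for e in ENTRANT_MOVES)
            and all(PAYOFF[(entry, a)][1] <= u_incumbent for a in INCUMBENT_MOVES))


def incumbent_expected_payoff(reply, entry_prob):
    """Incumbent's expected payoff if entry occurs with probability entry_prob."""
    return (entry_prob * PAYOFF[("Enter", reply)][1]
            + (1 - entry_prob) * PAYOFF[("Stay Out", reply)][1])


nash = [(e, a) for e in ENTRANT_MOVES for a in INCUMBENT_MOVES if is_nash(e, a)]
print("Nash equilibria:", nash)

# With no chance of entry the incumbent is indifferent; with a small
# tremble (assumed here to be 1 percent) Collude is strictly better.
for eps in (0.0, 0.01):
    print(eps,
          incumbent_expected_payoff("Collude", eps),
          incumbent_expected_payoff("Fight", eps))
```

Any positive entry probability, however small, gives the same ranking, which is why the weak equilibrium (Stay Out, Fight) does not survive trembles.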

Perfectness rules out threats that are not credible. Entry Deterrence I provides 
a good example, because if a communication move were added to the game tree, 
the incumbent might tell the entrant that entry would be followed by fighting, but 
the entrant would ignore this non-credible threat. If, however, some means existed 
by which the incumbent could precommit himself to fight entry, the threat would 
become credible. 


Should the Modeller Ever Use Non-Perfect Equilibria? 


A game in which a player can commit himself to a strategy can be modelled in two 
ways: 


(1) As a game in which non-perfect equilibria are acceptable, or 
(2) By changing the game to replace the action Do X with Commit to Do X at 
an earlier node. 


An example of (2) in Entry Deterrence I is to remodel the game so that the 
incumbent moves first, deciding in advance whether or not to choose Fight before 
the entrant gets to move. Approach (2) is better than (1), because usually the 
modeller wants to let players commit to some actions and not others, and he can 
do this by carefully specifying the order of play. Allowing equilibria to be non- 
perfect forbids such discrimination and usually multiplies the number of equilibria. 
Indeed, the problem with subgame perfectness is not that it is too restrictive, 
but that it still allows too many strategy combinations to be equilibria in games 
of asymmetric information. A subgame must start at a single node and not cut 
across any player’s information set, so often the only subgame will be the whole 
game and imposing subgame perfectness does not restrict equilibrium at all. In 
Section 5.1, we will discuss perfect Bayesian equilibrium and other ways to extend 
the perfectness concept to games of asymmetric information. 


4.4 Finitely Repeated Games and the Chainstore Paradox 
The Chainstore Paradox (Selten [1978]) 


Suppose that we repeat Entry Deterrence I 20 times in the context of a chainstore 
that is trying to deter entry into 20 markets where it has outlets. We have seen that 
entry into just one market would not be deterred, but perhaps with 20 markets the 
outcome is different because the chainstore would fight the first entrant to deter 
the next 19. It turns out this is not the case. 

The repeated game is much more complicated than the one-shot game, as we 
call the unrepeated version. A player's action is still to Enter or Stay Out, to 
Fight or Collude, but his strategy is a potentially very complicated rule telling him 
what action to choose depending on what actions both players took in each of the 
previous periods. Even the five-round repeated Prisoner's Dilemma has a strategy 
set for each player with over two billion strategies, and the number of strategy 
combinations is even greater (Sugden [1986], p. 108). 
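
The order of magnitude is easy to reproduce. The count below is a sketch under an assumption that the text does not confirm: that the figure of "over two billion" counts strategies that condition only on the opponent's past actions, so that a player has 1 + 2 + 4 + 8 + 16 = 31 decision points. If a strategy conditions on the past actions of both players, as in the description above, the number is far larger. The function name count_strategies is hypothetical.

```python
def count_strategies(rounds, actions=2, histories_per_round=2):
    """Number of pure strategies in a repeated two-action game.

    In round t (counting from 0) the player faces histories_per_round**t
    distinguishable histories and must pick one action for each of them.
    """
    decision_points = sum(histories_per_round ** t for t in range(rounds))
    return actions ** decision_points


# Conditioning on the opponent's past moves only: 2**31 strategies.
opponent_only = count_strategies(5, histories_per_round=2)

# Conditioning on both players' past moves: 2**341 strategies.
full_history = count_strategies(5, histories_per_round=4)

print(opponent_only)
print(len(str(full_history)), "digits")
```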

The obvious way to solve the game is from the beginning, where there is the 
least past history on which to condition a strategy, but that is not the easy way. 
We must follow Kierkegaard, who said, “Life can only be understood backwards, 
but it must be lived forwards." In picking his first action, a player looks ahead to 
its implications for all the future periods, so it is easiest to start by understanding 
the end of a multi-period game, where the future is shortest. 

Consider the situation in which 19 markets have already been invaded (and 
maybe the chainstore fought, maybe not). In the last market, the subgame in 
which the two players find themselves is identical to the one-shot Entry Deterrence 
I, so the entrant will Enter and the chainstore will Collude, regardless of the past 
history of the game. Consider the next-to-last market. The chainstore can gain 
nothing from building a reputation for ferocity, because it is common knowledge 
that he will Collude with the last entrant anyway. So he might as well Collude in 
the 19th market. But we can say the same of the 18th market—and by continuing 
backward induction, of every market, including the first. This result is called the 
Chainstore Paradox. 

Backward induction ensures that the strategy combination is a subgame perfect 
equilibrium. There are other Nash equilibria—(Always Fight, Never Enter), for 
example—but because of the Chainstore Paradox they are not perfect. 
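
The backward induction argument can be written as a short loop. This is a minimal sketch using the one-shot payoffs of Table 4.1; the names solve_stage and solve_chainstore are hypothetical, and the code relies on the point made above that equilibrium play in later markets does not depend on what happens in the current one.

```python
# One-shot payoffs (Entrant, Incumbent) from Table 4.1.
STAGE = {
    ("Enter", "Collude"): (40, 50),
    ("Enter", "Fight"): (-10, 0),
    ("Stay Out", "Collude"): (0, 100),
    ("Stay Out", "Fight"): (0, 100),
}


def solve_stage():
    """Backward induction within one market, taking later markets as given."""
    reply = max(("Collude", "Fight"), key=lambda a: STAGE[("Enter", a)][1])
    entry = max(("Enter", "Stay Out"), key=lambda e: STAGE[(e, reply)][0])
    return entry, reply


def solve_chainstore(markets=20):
    """Solve the last market first, then work back to the first market."""
    path = {}
    for market in range(markets, 0, -1):
        # Play in later markets is already fixed and is unaffected by this
        # market's actions, so only this market's payoffs matter here.
        path[market] = solve_stage()
    return [path[m] for m in range(1, markets + 1)]


equilibrium_path = solve_chainstore()
chainstore_total = sum(STAGE[play][1] for play in equilibrium_path)
print(equilibrium_path[0], equilibrium_path[-1], chainstore_total)
```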


The Repeated Prisoner's Dilemma 
The Prisoner's Dilemma of Section 1.2 is similar to Entry Deterrence I: the prisoners 
would like to commit themselves to Cooperate, but in the absence of commitment 
they Fink. The Chainstore Paradox can be applied to show that repetition does 
not induce cooperative behavior. Both prisoners know that in the last repetition, 
both will Fink. After 18 repetitions, they know that no matter what happens 
in the 19th, both will Fink in the 20th, so they might as well Fink in the 19th 
too. Building a reputation is pointless, because in the 20th period it is not going 
to matter. Proceeding inductively, both players Fink in every period, the unique 
perfect equilibrium outcome. 

In fact, as a consequence of the fact that the one-shot Prisoner’s Dilemma has a 
dominant strategy equilibrium, finking is the only Nash outcome for the repeated 
Prisoner’s Dilemma, not just the only perfect outcome. The argument of the pre- 
vious paragraph did not show that finking was the unique Nash outcome. To show 
subgame perfectness, we worked back from the end using longer and longer sub- 
games. To show that finking is the only Nash outcome, we do not look at subgames, 
but instead rule out successive classes of strategies from being Nash. Consider the 
portions of the strategy which apply to the equilibrium path (that is, the portions 
directly relevant to the payoffs). No strategy in the class that calls for Cooperate 
in the last period can be a Nash strategy, because the same strategy with Fink 
replacing Cooperate would dominate it. But if both players have strategies calling 
for finking in the last period, then no strategy that does not call for finking in the 
next-to-last period is Nash, because a player should deviate by replacing Cooperate 
with Fink in the next-to-last period. The argument can be carried back to the first 
period, ruling out any class of strategies that does not call for finking everywhere 
along the equilibrium path. 

The strategy of always finking is not a dominant strategy, as it is in the one-shot 
game, because it is not the best response to various suboptimal strategies such as 
(Cooperate until the other player finks, then fink for the rest of the game). Moreover, 
the uniqueness is only on the equilibrium path. Non-perfect Nash strategies could 
call for cooperation at nodes far away from the equilibrium path, since that action 
would never need to be taken. If Row has chosen (Always Fink), one of Column’s 
best responses is (Always fink unless Row has cooperated ten times; then always 
cooperate). 


4.5 Discounting 

A model with many rounds must specify whether payments are valued less if they 
are made at later rounds, i.e., whether they are discounted. Discounting is mea- 
sured by the discount rate or the discount factor. 


The discount rate, r, is the extra fraction of a payoff unit needed to com- 
pensate for delaying receipt by one period. 


The discount factor, δ, is the value in present payoff units of one payoff 
unit to be received one period from the present. 


The discount rate is analogous to the interest rate, and in some models the 
interest rate determines the discount rate. The discount factor represents exactly 
the same idea as the discount rate, and δ = 1/(1 + r). Models use r or δ depending 
on notational convenience. “No discounting” is equivalent to r = 0 and δ = 1, so 
the notation includes no discounting as a special case. 

Whether to put discounting into a model involves two questions. The first is 
whether the added complexity is accompanied by a change in the results or a 
surprising demonstration of no change in the results. If discounting makes no 
difference, it should not be included. If discounting does make a difference, the 
second question arises: do the events of the model occur in real time (Section 2.2), 
so that discounting is appropriate? The bargaining game Alternating Offers from 
Section 10.3 can be interpreted in two ways. One way, without discounting, is that 
the players make all their offers and counteroffers between dawn and dusk of a 
single day, so essentially no real time has passed. The other way, with discounting, 
is that each offer consumes a week of time, so that the delay before the bargain is 
reached is important to the players. 

Discounting has two important sources: time preference and a probability that 
the game might end, represented by the rate of time preference, ρ, and the 
probability each period that the game ends, θ. It is usually assumed that ρ and θ are 
constant. If they both take the value zero, the player does not care whether his 
payments are scheduled now or ten years from now. Otherwise, a player is 
indifferent between x/(1 + ρ) now and x guaranteed to be paid a period later. With 
probability (1 − θ) the game continues and the payment a period later is actually 
made, so the player is indifferent between (1 − θ)x/(1 + ρ) now and the promise of 
x to be paid a period later contingent upon the game still continuing. The discount 
factor is therefore 

    δ = 1/(1 + r) = (1 − θ)/(1 + ρ).                                   (4.1) 
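
Equation (4.1) is easy to evaluate for particular numbers. The sketch below is illustrative only: the function names and the sample values of 5 percent for the rate of time preference and 10 percent for the ending probability are assumptions chosen for the example.

```python
def discount_factor(rho, theta):
    """Discount factor from equation (4.1): (1 - theta) / (1 + rho)."""
    return (1 - theta) / (1 + rho)


def implied_discount_rate(delta):
    """Invert delta = 1 / (1 + r) to recover the discount rate r."""
    return 1 / delta - 1


# No time preference and no chance of ending: no discounting at all.
print(discount_factor(0.0, 0.0))

# Assumed sample values: rho = 0.05, theta = 0.10.
delta = discount_factor(0.05, 0.10)
r = implied_discount_rate(delta)
print(round(delta, 4), round(r, 4))
```

With these sample values the implied discount rate is larger than the rate of time preference alone, because the chance that the game ends also makes future payments less valuable.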


Table 4.2 summarizes the implications of discounting for the value of payment 
streams of various kinds. We will not go into how these are derived, but they all 
stem from the basic fact that a dollar paid in the future is worth δ dollars now. 


Table 4.2 Discounting 

                                                            Discounted value 
Payment                                                     r-notation                 δ-notation 
x at the end of one period                                  x/(1 + r)                  δx 
x at the end of each period up through T                    Σ_{t=1}^{T} x/(1 + r)^t    Σ_{t=1}^{T} δ^t x 
x at the end of each period in perpetuity                   x/r                        δx/(1 − δ) 
x at the start of each period in perpetuity                 x + x/r                    x/(1 − δ) 
x at time t in continuous time                              x e^{−rt}                  - 
a flow of x per period in continuous time up to T           ∫_0^T x e^{−rt} dt         - 
a flow of x per period in continuous time in perpetuity     x/r                        - 


Continuous time models usually refer to rates of payment rather than lump sums, 
so the discount factor is not so useful a concept, but discounting works the same 
way as in discrete time except that payments are continuously compounded. For a 
full explanation, see a finance text (e.g. Appendix A of Copeland & Weston [1983]). 
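
The discrete-time entries of Table 4.2 can be checked against one another numerically. The sketch below assumes a payment of x = 5 and a discount rate of r = 0.1 purely for illustration; the function names are hypothetical.

```python
def pv_finite(x, r, T):
    """Value of x paid at the end of each period 1, ..., T."""
    return sum(x / (1 + r) ** t for t in range(1, T + 1))


def pv_perpetuity_end(x, r):
    """Value of x paid at the end of each period forever: x / r."""
    return x / r


def pv_perpetuity_start(x, r):
    """Value of x paid at the start of each period forever: x + x / r."""
    return x + x / r


x, r = 5.0, 0.1          # assumed sample values
delta = 1 / (1 + r)      # the equivalent discount factor

# The r-notation and delta-notation columns should agree.
print(pv_perpetuity_end(x, r), delta * x / (1 - delta))
print(pv_perpetuity_start(x, r), x / (1 - delta))

# A long finite stream approaches the perpetuity value.
print(pv_finite(x, r, 2000))
```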


4.6 Infinitely Repeated Games and the Folk Theorem 


The contradiction between the Chainstore Paradox and what many people think 
of as real world behavior has been most successfully resolved by adding incomplete 
information to the model, as we will do in Section 5.3. Before we turn to the tools 
needed to handle incomplete information, however, we will explore certain other 
modifications. One idea is to repeat the Prisoner’s Dilemma an infinite number 
of times instead of a finite number (after all, few economies, except perhaps Hong 
Kong, have a known end date). Without a last period, the inductive argument in 
the Chainstore Paradox fails. In fact, we can find a simple perfect equilibrium for 
the infinitely repeated Prisoner’s Dilemma in which both players cooperate: both 
players adopt the “grim strategy.” 


Grim Strategy 
(1) Start by choosing Cooperate. 


(2) Continue to choose Cooperate unless some other player chooses Fink, in 
which case choose Fink forever thereafter. 


If Column uses the grim strategy, the grim strategy is weakly Row’s best re- 
sponse. If Row cooperates, he will continue to receive the high (Cooperate, 
Cooperate) payoff forever. If he finks, he will receive the higher (Fink, Cooperate) 
payoff once, but the best he can hope for thereafter is the (Fink, Fink) payoff. 

Even in the infinitely repeated game, cooperation is not immediate, and not 
every strategy that punishes finking is perfect. A notable example is the strategy 
“tit-for-tat.” 


Tit-for-Tat 

(1) Start by choosing Cooperate. 

(2) Thereafter, in period n choose the action that the other player chose in 

period (n — 1). 

If Column uses tit-for-tat, Row does not have an incentive to Fink first, because 
if Row cooperates he will continue to receive the high (Cooperate, Cooperate) pay- 
off, but if he finks and then returns to tit-for-tat, the players alternate (Fink, 
Cooperate) with (Cooperate, Fink) forever. Row’s average payoff from this alter- 
nation would be lower than if he had stuck to (Cooperate, Cooperate), and would 
swamp the one-time gain. But tit-for-tat is not perfect, because it is not rational 
for Column to punish Row’s initial Fink. Adhering to tit-for-tat’s punishments 
results in a miserable alternation of Finks, so Column would rather ignore Row’s 
first Fink. The deviation is not from the equilibrium path action of Cooperate, 
but from the off-equilibrium action rule of Fink in response to a Fink. Tit-for-tat, 
unlike the grim strategy, cannot enforce cooperation. 
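
The paths of play described above can be generated by simulation. The sketch below is an illustration under assumptions: it uses payoffs of 5 each for mutual cooperation, 10 for a lone finker, and 0 each for mutual finking, as quoted later from Table 4.3a, and it assumes a payoff of -5 for the player who cooperates against a finker, a value not stated in this passage. The discount factor of 0.9, the 200-period horizon, and all function names are likewise assumptions.

```python
# Payoffs (Row, Column); "C" is Cooperate and "F" is Fink.
# The -5 payoff for a lone cooperator is an assumed value.
PAYOFF = {
    ("C", "C"): (5, 5),
    ("C", "F"): (-5, 10),
    ("F", "C"): (10, -5),
    ("F", "F"): (0, 0),
}


def grim(own_history, other_history):
    """Cooperate until the other player finks, then fink forever."""
    return "F" if "F" in other_history else "C"


def tit_for_tat(own_history, other_history):
    """Start with Cooperate, then copy the other player's last action."""
    return other_history[-1] if other_history else "C"


def always_fink(own_history, other_history):
    return "F"


def fink_once_then(strategy):
    """Deviation: Fink in the first period, then follow `strategy`."""
    def deviant(own_history, other_history):
        return "F" if not own_history else strategy(own_history, other_history)
    return deviant


def play(row, column, periods):
    """Return the list of (Row action, Column action) pairs."""
    row_hist, col_hist = [], []
    for _ in range(periods):
        a, b = row(row_hist, col_hist), column(col_hist, row_hist)
        row_hist.append(a)
        col_hist.append(b)
    return list(zip(row_hist, col_hist))


def discounted(path, delta, player=0):
    return sum(delta ** t * PAYOFF[p][player] for t, p in enumerate(path))


delta = 0.9  # assumed discount factor
cooperate_value = discounted(play(grim, grim, 200), delta)
deviate_vs_grim = discounted(play(always_fink, grim, 200), delta)
alternation = play(fink_once_then(tit_for_tat), tit_for_tat, 4)
deviate_vs_tft = discounted(play(fink_once_then(tit_for_tat), tit_for_tat, 200), delta)
print(alternation)
print(cooperate_value, deviate_vs_grim, deviate_vs_tft)
```

With these numbers Row does worse by finking against either strategy. The simulation shows only the paths of play; it does not test whether carrying out the punishment is itself rational, which is the point on which tit-for-tat fails.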
Unfortunately, although eternal cooperation is a perfect equilibrium outcome 
in the infinite game under at least one strategy, so is practically anything else, 
including eternal finking. The multiplicity of equilibria is summarized by the Folk 
Theorem, so called because no one remembers who should get credit for it. One 
version is stated below, and we will spend the rest of the section explaining it. 


Theorem 4.1 (The Folk Theorem) 


In an infinitely repeated n-person game with finite action sets at each repeti- 
tion, any combination of actions observed in any finite number of repetitions 
is the unique outcome of some subgame perfect equilibrium given 


Condition 1: The rate of time preference is zero, or positive and sufficiently 
small; 


Condition 2: The probability that the game ends at any repetition is zero, 
or positive and sufficiently small; and 


Condition 3 (Dimensionality): The set of payoff combinations that 
strictly Pareto-dominate the minimax payoff combinations in the mixed 
extension of the one-shot game is n-dimensional. 


What the Folk Theorem tells us is that claiming that particular behaviour arises 
in a perfect equilibrium is meaningless in an infinitely repeated game. This applies 
to any game that meets Conditions 1 to 3, not just the Prisoner’s Dilemma. If an 
infinite amount of time always remains in the game, a way can always be found to 
make one player willing to punish another for the sake of a better future, even if 
the punishment currently hurts the punisher as well as the punished. Any finite 
interval of time is insignificant compared to eternity, so the threat of future reprisal 
makes the players willing to carry out the punishments needed. 

We will next discuss Conditions 1 to 3. 


Discounting 


The Folk Theorem helps answer whether discounting future payments lessens the 
influence of the troublesome Last Period. Quite the opposite is true. With dis- 
counting, the present gain from finking is weighted more heavily and future gains 
from cooperation more lightly. If the discount rate is very high the game almost 
returns to being one-shot. When the real interest rate is 1,000 percent, a payment 
next year is little better than a payment a hundred years hence, so next year is 
practically irrelevant. Any model that relies on a large number of repetitions also 
relies on the discount rate not being too high. 

Allowing a little discounting is nonetheless important to show there is no discon- 
tinuity at the discount rate of zero. If we come across an undiscounted infinitely 
repeated game with many equilibria, the Folk Theorem tells us that adding a small 
discount rate will not reduce the number of equilibria. This contrasts with the 
effect of changing the model by making the number of repetitions large but finite, 
which often eliminates all but one outcome by inducing the Chainstore Paradox. 

A discount rate of zero supports many perfect equilibria, but if the rate is large 
enough, the only equilibrium outcome is eternal finking. We can calculate the 
critical value for given parameters. The grim strategy imposes the heaviest possible 
punishment for deviant behavior. Using the payoffs for the Prisoner's Dilemma from 
Table 4.3a in the next section, the equilibrium payoff from the grim strategy is the 
current payoff of 5 plus the value of the rest of the game, which from Table 4.2 is 
5/r. If Row deviated by finking, he would receive a current payoff of 10, but the 
value of the rest of the game would fall to 0. The critical value of the discount rate 
is found by solving the equation 5 + 5/r = 10 + 0, which yields r = 1, a discount 
rate of 100 percent or a discount factor of δ = 0.5. Unless the players are extremely 
impatient, finking is not much of a temptation. 
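
The same calculation can be written for general payoffs. In the sketch below, which is an illustration rather than part of the original text, the per-period payoffs are named cooperate, temptation, and punishment (hypothetical names), and the condition for cooperation is cooperate + cooperate/r ≥ temptation + punishment/r.

```python
def critical_discount_rate(cooperate, temptation, punishment):
    """Largest discount rate at which the grim strategy sustains cooperation.

    Cooperation is worthwhile when
        cooperate + cooperate / r >= temptation + punishment / r,
    which rearranges to r <= (cooperate - punishment) / (temptation - cooperate).
    Assumes temptation > cooperate > punishment.
    """
    return (cooperate - punishment) / (temptation - cooperate)


def cooperation_sustainable(r, cooperate, temptation, punishment):
    return cooperate + cooperate / r >= temptation + punishment / r


# Payoffs used in the text: 5 for cooperating, 10 for finking once, 0 afterwards.
r_star = critical_discount_rate(5, 10, 0)
delta_star = 1 / (1 + r_star)
print(r_star, delta_star)
print(cooperation_sustainable(0.5, 5, 10, 0), cooperation_sustainable(2.0, 5, 10, 0))
```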


Random Ending 


Time preference is fairly straightforward, but what is surprising is that assuming 
that the game ends each period with probability θ does not make a drastic difference. 
In fact, we could even allow θ to vary over time, so long as it never became too 
large. If θ > 0, the game ends with probability one; or, put less dramatically, 
the expected number of repetitions is finite, but it still behaves like a discounted 
infinite game, because the expected number of future repetitions is always large, no 
matter how many have already occurred. The game still has no Last Period, and 
it is still true that imposing one, no matter how far beyond the expected number 
of repetitions, would radically change the results. 

I find it interesting that “(a) the game will end at some uncertain date before
T,” is different from “(b) a constant probability of the game ending." Under (a),
the game is like a finite game, because as time passes the maximum amount of time
still to run shrinks to zero. Under (b), even though the game will probably end by
T, if it lasts until T the game looks exactly the same as at time zero.
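
The contrast between (a) and (b) can be checked numerically. This is a sketch under stated assumptions: for (a), the end date is taken to be uniformly distributed on periods 1 to T, which is a hypothetical choice made only for illustration; for (b), each period played is the last with constant probability theta.

```python
def expected_remaining_constant(theta):
    """Expected periods still to be played under (b), given the game has not ended.

    With a constant ending probability theta, the number of periods still to
    be played is geometric with mean 1 / theta, whatever the current date.
    """
    if not 0 < theta <= 1:
        raise ValueError("theta must lie in (0, 1]")
    return 1 / theta


def expected_remaining_bounded(t, T):
    """Expected periods still to be played under (a) after period t.

    Assumes the end date is uniform on 1..T. Given survival past t, the end
    date is uniform on t+1..T, so the remaining length is uniform on 1..T-t.
    """
    if not 0 <= t < T:
        raise ValueError("need 0 <= t < T")
    return (T - t + 1) / 2


# Under (b) the outlook never changes; under (a) it shrinks as t approaches T.
outlook_b = [expected_remaining_constant(0.25) for t in (0, 8, 9)]
outlook_a = [expected_remaining_bounded(t, 10) for t in (0, 8, 9)]
```

Under (b) the player always faces the same expected future, which is why no period plays the role of a Last Period.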


The Dimensionality Condition 

The “minimax payoff” mentioned in Theorem 4.1 is the payoff that results if all the 
other players pick strategies solely to punish player i, and he protects himself as 
best he can (see note N4.6). The dimensionality condition is needed only for games 
with three or more players. It is satisfied if for each player there is some payoff 
combination in which his payoff is greater than his minimax payoff but different 
from the payoff of every other player. This is satisfied by the n-person Prisoner's 
Dilemma in which a solitary finker gets a higher payoff than his cooperating fellow- 
prisoners. It is not satisfied by Pure Coordination, in which all the players have the 
same payoff. The condition is necessary because establishing the desired behavior 
requires some way for the other players to punish a deviator without punishing 
themselves. 


Precommitment 


What if we use metastrategies, abandoning the idea of perfectness by allowing 
players to commit at the start to a strategy for the rest of the game? We would
still want to keep the game noncooperative by disallowing binding promises, but 
we could model it as a game with simultaneous choices by both players, or with 
one move each in sequence. 

If precommitted strategies are chosen simultaneously, the equilibrium outcome of 
the finitely repeated Prisoner’s Dilemma calls for always finking, because allowing 
commitment is the same as allowing equilibria to be non-perfect, in which case, as 
was shown earlier, the unique Nash outcome is always finking. 

A different result is achieved if the players precommit to strategies in sequence. 
The outcome depends on the particular values of the parameters, but one possible 
equilibrium is the following: Row moves first and chooses the strategy (Cooperate 
until Column Finks; thereafter always Fink), and Column chooses (Cooperate until 
the last period; then Fink). The observed outcome would be for both players to 
cooperate until the last period, and then for Row to again cooperate, but be finked 
by Column. Row would submit to this because if he chose a strategy that initiated 
finking earlier, Column would choose a strategy of starting to fink earlier too. The 
game has a second-mover advantage. 


4.7 Reputation: the One-Sided Prisoner’s Dilemma 


In Part II of this book we will look at moral hazard and adverse selection. Under 
moral hazard, a player wants to commit to high effort, but he cannot credibly do so. 
Under adverse selection, a player wants to prove he is high ability, but he cannot. 
In both, the problem is that the penalties for lying are insufficient. Reputation 
seems to offer a way out of the problem. If the relationship is repeated, perhaps a 
player is willing to be honest in early periods in order to establish a reputation for 
honesty valuable to himself later. 

Reputation seems to play a similar role in making threats to punish credible. 
Usually punishment is costly to the punisher as well as the punished, and it is not 
clear why the punisher should not let bygones be bygones. Yet in 1988 we see the 
Soviet Union paying off 70-year-old debt to dissuade the Swiss authorities from 
blocking a mutually beneficial new bond issue (“Soviets Agree to Pay Off Czarist 
Debt to Switzerland,” Wall Street Journal, 19 January 1988, p. 60). Why were the 
Swiss so vindictive towards Lenin? 

The questions of why players do punish and do not cheat are really the same 
questions that arise in the repeated Prisoner’s Dilemma, where only an infinite 
number of repetitions allows cooperation. That is the great problem of reputation: 
since everyone knows that a player will Fink, choose low effort, or default on debt 
in the last period, why do they suppose he will bother to build up a reputation in 
the present? Why should past behavior be any guide to future behavior?

To show how reputation works in a given situation, we must show why the 
Chainstore Paradox does not apply. The two approaches are to show that 


(1) There is cooperation in the last period, or 
(2) The early periods are different from the last period. 


Not all reputation problems fit the basic Prisoner's Dilemma. Some, such as
duopoly or the original Prisoner’s Dilemma, are two-sided in the sense that each
player has the same strategy set and the payoffs are symmetric. Others, such
as product quality, are what we might call one-sided Prisoner’s Dilemmas,
which have properties similar to the Prisoner’s Dilemma but do not fit the usual
definition because they are asymmetric. Table 4.3 shows the normal forms for 
both the original Prisoner’s Dilemma and the one-sided version. The important 
difference is that in the one-sided Prisoner’s Dilemma at least one player really 
does prefer (Cooperate, Cooperate) to anything else. He finks defensively, rather 
than both offensively and defensively. The payoff (0,0) can often be interpreted as 
the refusal of one player to interact with the other: for example, the motorist who 
refuses to buy cars from Chrysler because he knows they once falsified odometers. 
Table 4.4 lists examples of both one-sided and two-sided games. Versions of the 
Prisoner’s Dilemma with three or more players can also be classified as two-sided 
or one-sided, depending on whether all players find Fink a dominant strategy or 
not. 


Table 4.3 Prisoner’s Dilemmas

(a) Two-sided

                              Column
                      Cooperate      Fink
        Cooperate       5, 5        -5, 10
Row
        Fink           10, -5        0, 0

Payoffs to: (Row, Column)

(b) One-sided

                                      Consumer
                                 Buy          Boycott
                             (Cooperate)      (Fink)
        Honest (Cooperate)      5, 5           0, 0
Seller
        Cheat (Fink)           10, -5          0, 0

Payoffs to: (Seller, Consumer)


The Nash and iterated dominant strategy equilibria in the one-sided Prisoner's
Dilemma are still (Fink, Fink), but it is not a dominant strategy equilibrium.
Column does not have a dominant strategy, because if Row were to choose Cooperate,
Column would also choose Cooperate, to obtain the payoff of five; but if Row
chooses Fink, Column would choose Fink and obtain the payoff of zero. Fink is
weakly dominant for Row, which makes (Fink, Fink) the iterated dominant strategy
equilibrium. In both games, the players would like to persuade each other that
they will cooperate, and devices that induce cooperation in the one-sided game will
usually obtain the same result in the two-sided game.

Table 4.4 Repeated games in which reputation is important

Application             Sidedness    Actions (one line per player)

Prisoner’s Dilemma      two-sided    Cooperate/Fink
                                     Cooperate/Fink
Duopoly                 two-sided    Maintain price/Drop price
                                     Maintain price/Drop price
Employer-Worker         two-sided    Work/Slack off
                                     Bonus/No bonus
Product Quality         one-sided    High quality/Low quality
                                     High price/Low price
Entry Deterrence        one-sided    P < MC / P = MC
                                     Not enter/Enter
Financial Disclosure    one-sided    Tell truth/Lie
                                     High price/Low price
Borrowing               one-sided    Repay/Default
                                     Lend/Do not lend
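
The dominance claims made about the two games of Table 4.3 can be verified mechanically. The sketch below is illustrative only: the helper names are hypothetical, and the one-sided payoffs assume that a boycott yields (0, 0) whether the seller is honest or not, which is what the text's claim of weak dominance for Row requires.

```python
from itertools import product

# Payoffs are (Row, Column), keyed by (Row action, Column action).
TWO_SIDED = {
    ("Cooperate", "Cooperate"): (5, 5),
    ("Cooperate", "Fink"): (-5, 10),
    ("Fink", "Cooperate"): (10, -5),
    ("Fink", "Fink"): (0, 0),
}
# Assumed one-sided payoffs: a boycott yields (0, 0) whatever the seller does.
ONE_SIDED = {
    ("Cooperate", "Cooperate"): (5, 5),
    ("Cooperate", "Fink"): (0, 0),
    ("Fink", "Cooperate"): (10, -5),
    ("Fink", "Fink"): (0, 0),
}
ACTIONS = ("Cooperate", "Fink")


def payoff(game, player, own, other):
    """Payoff to `player` (0 = Row, 1 = Column) from playing `own` against `other`."""
    key = (own, other) if player == 0 else (other, own)
    return game[key][player]


def dominance(game, player, action):
    """Return 'strict', 'weak', or None for how `action` dominates the other action."""
    diffs = [
        payoff(game, player, action, other) - payoff(game, player, alt, other)
        for alt in ACTIONS if alt != action
        for other in ACTIONS
    ]
    if all(d > 0 for d in diffs):
        return "strict"
    if all(d >= 0 for d in diffs) and any(d > 0 for d in diffs):
        return "weak"
    return None


def pure_nash(game):
    """All pure-strategy profiles in which each action is a best reply to the other."""
    return [
        (r, c)
        for r, c in product(ACTIONS, ACTIONS)
        if all(payoff(game, 0, r, c) >= payoff(game, 0, alt, c) for alt in ACTIONS)
        and all(payoff(game, 1, c, r) >= payoff(game, 1, alt, r) for alt in ACTIONS)
    ]
```

In the two-sided game Fink is strictly dominant for both players; in the one-sided game it is only weakly dominant for Row and not dominant at all for Column, yet (Fink, Fink) remains the only pure Nash equilibrium.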


4.8 Product Quality in an Infinitely Repeated Game 


The Folk Theorem tells us that some perfect equilibrium of an infinitely repeated 
game (sometimes called an infinite horizon model) can generate any pattern of
behavior observed over a finite number of periods. But since the Folk Theorem is no 
more than a mathematical result, the strategies that generate particular patterns of 
behavior may be unreasonable. The theorem’s value is in provoking close scrutiny 
of infinite horizon models so that the modeller must show why his equilibrium is 
better than the multitude of others. He must go beyond satisfaction of the technical 
criterion of perfectness and justify the strategies on other grounds. 

In the simplest model of product quality, a seller can choose between producing 
costly high quality or costless low quality, and the buyer cannot determine quality 
before he purchases. If the seller would produce high quality under symmetric 
information, we have a one-sided Prisoner’s Dilemma, as in Table 4.3b. Both players 
are better off when the seller produces high quality and the buyer purchases the 
product, but the seller’s weakly dominant strategy is to produce low quality, so 
the buyer will not purchase. This is also an example of moral hazard, the topic of 
Chapter 6. 

A potential solution is to repeat the game, allowing the firm to choose quality at 
each repetition. If the number of repetitions is finite, however, the outcome stays 
the same because of the Chainstore Paradox. In the last repetition, the subgame is 
identical to the one-shot game, so the firm chooses low quality. In the next-to-last
repetition, it is foreseen that the last period’s outcome is independent of current 
actions, so the firm also chooses low quality, an argument that can be carried back 
to the first repetition. 

If the game is repeated an infinite number of times, the Chainstore Paradox 
is inapplicable and the Folk Theorem says that a wide range of outcomes can be 
observed in equilibrium. Klein & Leffler (1981) construct a plausible equilibrium 
for an infinite period model. Their original article, in the traditional verbal style of 
UCLA, does not phrase the result in terms of game theory, but we will recast it here. 
In equilibrium, the firm is willing to produce a high quality product because it can 
sell at a high price for many periods, but consumers refuse to ever buy again from 
a firm that has once produced low quality. The equilibrium price is high enough 
that the firm is unwilling to sacrifice its future profits for a one-time windfall from 
deceitfully producing low quality and selling it at a high price. Although this is 
only one of a large number of subgame perfect equilibria, the consumers’ behavior 
is simple and rational: no consumer can benefit by deviating from the equilibrium. 


Product Quality 


Players
An infinite number of potential firms and a continuum of consumers.

Information
Asymmetric, complete, and certain.

Actions and Events
[Order of play and payoffs are illegible in the source.]


That the firm can produce low-quality items at zero marginal cost is unrealistic, 
but it is only a simplifying assumption. By normalizing the cost of producing low 
quality to zero, we avoid having to carry an extra variable through the analysis 
without affecting the result. 

The Folk Theorem tells us that this game has a wide range of perfect outcomes, 
including a large number with erratic quality patterns like (High, High, Low, High, 
Low, Low, ...). If we confine ourselves to pure strategy equilibria with the stationary
outcome of constant quality and identical behavior by all firms in the market, 
then the two outcomes are low quality and high quality. Low quality is always an 
equilibrium outcome, since it is an equilibrium of the one-shot game. If the discount 
rate is low enough, high quality is also an equilibrium outcome, and this will be the 
focus of our attention. Consider the following strategy combination: 


Firms. ñ firms enter. Each produces high quality and sells at price p. If a firm
ever deviates from this, it thereafter produces low quality and sells at price p. The
values of p and ñ are given by equations (4.3) and (4.7) below.


Buyers. Buyers start by choosing randomly among the firms charging p. There-
after, they remain with their initial firm unless it changes its price or quality, in
which case they switch randomly to a firm that has not changed its price or quality.


This strategy combination is a perfect equilibrium. Each firm is willing to pro- 
duce high quality and refrain from price-cutting because otherwise it would lose all 
its customers. If it has deviated, it is willing to produce low quality because the 
quality is unimportant, given the absence of customers. Buyers stay away from a 
firm that has produced low quality because they know it will continue to do so, 
and they stay away from a firm that has cut the price because they know it will 
produce low quality. For this story to work, however, the equilibrium must satisfy 
three constraints that will be explained in more depth in Section 6.3: incentive 
compatibility, competition, and market clearing. 

The incentive compatibility constraint says that the individual firm must be 
willing to produce high quality. Given the buyers' strategy, if the firm ever produces 
low quality it receives a one-time windfall profit, but loses its future profits. The 
tradeoff is represented by constraint (4.2), which is satisfied if the discount rate is 
low enough. 


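(Incentive Compatibility)   $\frac{q_i p}{1+r} \le \frac{q_i (p - c)}{r}$.   (4.2)

Here $q_i$ is the firm's output each period, $c$ is the marginal cost of high quality, and
$r$ is the discount rate. The left side is the present value of producing low quality
once and then losing all customers; the right side is the present value of producing
high quality forever. Inequality (4.2) determines a lower bound for the price, which
must satisfy

$p \ge (1+r)c$.   (4.3)


We could write (4.3) as an equality rather than an inequality because any firm
trying to charge a price higher than the quality-guaranteeing p would lose all its
customers and receive a payoff of -F.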


The second constraint is that competition drives profits to zero, so firms are
indifferent between entering and staying out of the market.

(Competition)   $\frac{q_i (p - c)}{r} = F$.   (4.4)

Treating (4.3) as an equation and using it to replace p in equation (4.4), we obtain

$q_i = \frac{F}{c}$.   (4.5)

We have now determined p and $q_i$, and only n remains, which is determined by
the equality of supply and demand. The market does not always clear in models of 
asymmetric information (see Stiglitz [1987]), and in this model each firm would like 
to sell more than its equilibrium output at the equilibrium price, but the market 
output must equal the quantity demanded by the market. 


(Market Clearing)   $n q_i = q(p)$.   (4.6)

Combining equations (4.3), (4.5), and (4.6) yields

$\tilde{n} = \frac{c \, q([1+r]c)}{F}$.   (4.7)

We have now determined the equilibrium values, the only difficulty being the stan- 
dard existence problem caused by the requirement that the number of firms be an 
integer (see note N4.8). 
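
The three constraints can be solved numerically once a demand curve is chosen. The sketch below assumes the constraints take the forms p = (1 + r)c, q_i = F/c, and n q_i = q(p); the linear demand curve and all parameter values are hypothetical and serve only as an illustration.

```python
def product_quality_equilibrium(c, r, F, demand):
    """Equilibrium price, firm output, and number of firms.

    c      marginal cost of high quality
    r      discount rate
    F      entry cost
    demand function giving market quantity demanded at a price
    """
    p = (1 + r) * c       # incentive compatibility held with equality
    q_i = F / c           # zero profits: q_i * (p - c) / r = F
    n = demand(p) / q_i   # market clearing: n * q_i = q(p)
    return p, q_i, n


def linear_demand(p):
    """A hypothetical demand curve, q(p) = 100 - 10p."""
    return 100 - 10 * p


p, q_i, n = product_quality_equilibrium(c=2, r=0.25, F=10, demand=linear_demand)

# At these values the firm is exactly indifferent between cheating once and
# staying honest, and the value of its profit stream just covers the entry cost.
cheat_value = q_i * p / (1 + 0.25)
honest_value = q_i * (p - 2) / 0.25
```

Note that n need not come out as an integer for arbitrary parameters, which is the existence problem mentioned in the text.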

The equilibrium price is fixed because F is exogenous and demand is not perfectly 
inelastic, which pins down the size of firms. If there were no entry cost, but demand 
were still elastic, then the equilibrium price would still be the unique p that satisfied 
constraint (4.3), and the market quantity would be determined by q(p), but F and q_i
would be undetermined. If consumers believed that any firm which might possibly
produce high quality paid an exogenous dissipation cost F, the result would be a
continuum of equilibria. The firms’ best response would be for ñ of them to pay
F and produce high quality at price p, where ñ is determined by the zero profit
condition as a function of F. Klein & Leffler note this indeterminacy and suggest 
that the profits might be dissipated by some sort of brand-specific capital. The 
history of the industry may also explain the number of firms. Schmalensee (1982) 
shows how a pioneering brand can retain a large market share because consumers 
are unwilling to investigate the quality of new brands. 


Recommended Reading 


Fudenberg, Drew & Eric Maskin (1986) "The Folk Theorem in Repeated Games 
with Discounting or with Incomplete Information" Econometrica. May 1986. 54, 
3: 533-54. 

Klein, Benjamin & Keith Leffler (1981) "The Role of Market Forces in Assuring 
Selten, Reinhard (1978) “The Chain-Store Paradox” Theory and Decision. April

Problem 4
Repeated Entry Deterrence 


Consider two repetitions without discounting of the game Entry Deterrence I from 
Section 4.3. Assume that there is one entrant, who sequentially decides whether to 
enter two markets that have the same incumbent. 


(1) Draw the extensive form of this game. 

(2) What are the strategy sets for the entrant and the incumbent? 
(3) What is the subgame perfect equilibrium? 

(4) What is one of the non-perfect Nash equilibria? 


Notes 
N4.2 Subgame Perfectness 


Often "perfectness" is called "perfection."  "Perfectness" is the term used in Selten 
(1975), and conveys an impression of completeness more appropriate to the concept 
than the goodness implied by "perfection". 

Perfectness is not the only way to eliminate weak Nash equilibria like (Stay Out,
Collude). In Entry Deterrence I, (Enter, Collude) is the only iterated dominant strategy
equilibrium, because Fight is weakly dominated for the incumbent. 

The distinction between perfect and non-perfect Nash equilibria is like the distinction 
between closed loop and open loop trajectories in dynamic programming. Closed 
loop (or feedback) trajectories can be revised after they start, like perfect equilibrium 
strategies, while open loop trajectories are completely prespecified (though able to de- 
pend on state variables). In dynamic programming the distinction is not so important, 
because prespecified strategies do not change the behavior of other players. No threat, 
for example, is going to alter the pull of the moon's gravity on a rocket. 

A subgame can be infinite in length, and infinite games can have non-perfect equilibria. 
The infinitely repeated Prisoner's Dilemma is an example, in which every subgame looks 
exactly like the original game, but begins at a different point in time. 

Perfectness in Macroeconomics. In macroeconomics the requirement of dynamic 
consistency or time consistency is similar to perfectness. These terms are less pre- 
cisely defined than perfectness, but they usually require only that strategies be best 
responses in subgames starting from nodes on the equilibrium path, instead of all sub- 
games. Under this interpretation, time consistency is a less stringent condition than 
perfectness. 

The Federal Reserve would like to induce inflation to stimulate the economy, but the 
economy is stimulated only if the inflation is unexpected. If the inflation is expected, 
its effects are purely bad. Since members of the public know that the Fed would like 
to fool them, they disbelieve its claims that it will not generate inflation (see Kydland 
& Prescott [1977]). Likewise, the government would like to issue nominal debt, and 
promises lenders that it will keep inflation low, but once the debt is issued, the gov- 
ernment has incentive to inflate its real value to zero. One reason that the Federal 
Reserve Board was established to be independent of Congress in the United States was
to diminish this problem.
In many situations, irrationality (behavior that is automatic rather than strategic) is
an advantage. The Doomsday Machine of the movie Dr Strangelove is one example: it 
blows up the world if anyone explodes a nuclear bomb (although, as Dr Strangelove 
says, it is worse than useless if no one tells the bomb-owners about the machine). 
President Nixon reportedly told his aide H.R. Haldeman about a more complicated 
version of this strategy: “I call it the Madman Theory, Bob. I want the North Viet- 
namese to believe that I’ve reached the point where I might do anything to stop the 
war. We'll just slip the word to them that ‘for God's sake, you know Nixon is obsessed 
about Communism. We can't restrain him when he's angry—and he has his hand on 
the nuclear button—and Ho Chi Minh himself will be in Paris in two days begging for 
peace” (Haldeman & DiMona [1978] p. 83). The Gang of Four model in Section 5.4
tries to model a situation like this.

The “lock-up agreement” (see Macey & McChesney [1985] p. 33) is an example of a
credible threat: in a takeover defense, the threat to destroy the firm is made legally
binding.

Bernheim, Peleg, & Whinston (1987) and Bernheim & Whinston (1987) introduce the
equilibrium concept of coalition-proof Nash equilibrium. In this refinement of Nash, 
a strategy combination is an equilibrium only if no coalition of players could form a 
self-enforcing agreement to deviate from it. The concept is easily extended to include 
perfectness. 


N4.3 An Example of Perfectness: Entry Deterrence I 




The Stackelberg equilibrium of a duopoly game (Section 3.4) can be viewed as the perfect
equilibrium of a Cournot Game modified so that one player moves first, a game similar 
to Entry Deterrence I. The player moving first is the Stackelberg leader and the player 
moving second is the Stackelberg follower. The follower could threaten to produce a 
large output, but he will not carry out his threat if the leader produces a large output 
first. 

Perfectness is not so desirable a property of equilibrium in biological games. The reason 
the order of moves matters is because the rational best reply depends on the node at 
which the game has arrived. In many biological games the players act by instinct and 
unthinking behavior is not unrealistic (see Section 5.6). 

Reinganum & Stokey (1985) is a clear presentation of the implications of perfectness 
and commitment illustrated with the example of natural resource extraction. 


N4.5 Finitely Repeated Games and the Chainstore Paradox


The Chainstore Paradox does not apply to all games as neatly as to Entry Deterrence 
and the Prisoner’s Dilemma. If the one-shot game has only one Nash equilibrium, the 
perfect equilibrium of the finitely repeated game is unique and has that same outcome. 
But if the one-shot game has multiple Nash equilibria, the perfect equilibrium of the 
finitely repeated game can have not only the one-shot outcomes, but others besides. See 
Benoit & Krishna (1985), Harrington (1987), and Moreaux (1985). 

The quotation is attributed to Soren Kierkegaard’s Life in John Bartlett, Familiar Quo- 
tations, 14th edition, Boston: Little, Brown, and Co., 1968, p. 676. Bartlett's reference 
is vague and should be treated skeptically. 

The peculiarity of the unique Nash equilibrium for the Repeated Prisoner's Dilemma 
was noticed long before Selten (1978) (see Luce & Raiffa [1957] p. 99), but the term 
"Chainstore Paradox" is now generally used for all unravelling games of this kind. 

An epsilon-equilibrium is a strategy combination s* such that no player has more than an ε
incentive to deviate from his strategy given that the other players do not deviate. Formally,


$\forall i, \; \pi_i(s_i^*, s_{-i}^*) \ge \pi_i(s_i', s_{-i}^*) - \epsilon, \; \forall s_i' \in S_i$.   (4.8)
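
The definition can be turned into a direct check. The sketch below applies it to the one-shot Prisoner's Dilemma of Table 4.3a purely as an illustration; the function names are hypothetical, and the result cited next in the text concerns the finitely repeated game, which this code does not model.

```python
# One-shot Prisoner's Dilemma of Table 4.3a; payoffs are (Row, Column).
PD = {
    ("Cooperate", "Cooperate"): (5, 5),
    ("Cooperate", "Fink"): (-5, 10),
    ("Fink", "Cooperate"): (10, -5),
    ("Fink", "Fink"): (0, 0),
}
ACTIONS = ("Cooperate", "Fink")


def max_deviation_gain(game, profile):
    """Largest payoff gain any one player can obtain by deviating unilaterally."""
    row, col = profile
    row_gain = max(game[(alt, col)][0] for alt in ACTIONS) - game[profile][0]
    col_gain = max(game[(row, alt)][1] for alt in ACTIONS) - game[profile][1]
    return max(row_gain, col_gain)


def is_epsilon_equilibrium(game, profile, eps):
    """True if no player can gain more than eps by deviating from `profile`."""
    return max_deviation_gain(game, profile) <= eps
```

With eps = 0 the check reduces to the ordinary Nash condition, so only (Fink, Fink) passes; mutual cooperation passes once eps reaches the deviation gain of 5.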


Radner (1980) has shown that cooperation can arise as an ε-equilibrium of the finitely
repeated Prisoner’s Dilemma. Fudenberg & Levine (1986) compare the ε-equilibria of
finite games with the Nash equilibria of infinite games. Other concepts besides Nash
can also use the ε-equilibrium idea.

A general way to decide whether a mathematical result is a trick of infinity is to see if 
the same result is obtained as the limit of results for longer and longer finite models. 
Applied to games, a good criterion for picking among equilibria of an infinite game is to 
select one which is the limit of the equilibria for finite games as the number of periods 
gets longer. Fudenberg & Levine (1986) show under what conditions one can find the 
equilibria of infinite horizon games by this process. For the Prisoner's Dilemma, (Always
Fink) is the only equilibrium in all finite games, so it uniquely satisfies the criterion. 


A Markov strategy is a strategy that at each node chooses the action independently of the history 
of the game except for the immediately preceding action (or actions, if they were simultaneous) and
except as the history determines the action set. 


Markov strategies are memoryless. If the players in an undiscounted, infinitely repeated 
Prisoner's Dilemma are restricted to Markov strategies, the strategy set is reduced to 
the eight strategies of the form (Begin with X; pick Y if the other player finked, pick Z if the 
other player cooperated). If we consider only Markov pure strategies, the unique iterated 
dominant strategy equilibrium is for both players to choose tit-for-tat (Aumann [1981]). 
See Section 12.4 for another application of Markov strategies, to customers who have 
switching costs. 
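
The eight Markov pure strategies can be enumerated and played against one another. This is a minimal sketch; the tuple encoding (X, Y, Z) and the function name `play` are hypothetical conveniences, not notation from the text.

```python
from itertools import product

ACTIONS = ("Cooperate", "Fink")

# A Markov strategy is (X, Y, Z): begin with X; pick Y if the other player
# finked last period; pick Z if the other player cooperated last period.
MARKOV_STRATEGIES = list(product(ACTIONS, repeat=3))

TIT_FOR_TAT = ("Cooperate", "Fink", "Cooperate")
ALWAYS_FINK = ("Fink", "Fink", "Fink")


def play(strategy_1, strategy_2, periods):
    """Return the list of (action_1, action_2) pairs over the given number of periods."""
    a1, a2 = strategy_1[0], strategy_2[0]
    history = [(a1, a2)]
    for _ in range(periods - 1):
        # Each player conditions only on the other's immediately preceding action.
        next_1 = strategy_1[1] if a2 == "Fink" else strategy_1[2]
        next_2 = strategy_2[1] if a1 == "Fink" else strategy_2[2]
        a1, a2 = next_1, next_2
        history.append((a1, a2))
    return history
```

Two tit-for-tat players cooperate in every period, while tit-for-tat against (Always Fink) is exploited once and finks thereafter.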

There are two ways to use Markov strategies: (1) just look for equilibria that use Markov 
strategies, and (2) disallow non-Markov strategies and then look for equilibria. Because 
the first way does not disallow non-Markov strategies, the equilibrium must be such that 
no player wants to deviate by using any other strategy, Markov or not. This is just a
way of eliminating possible multiple equilibria by discarding ones that use non-Markov 
strategies. The second way is much more dubious, because it requires the players not 
to use non-Markov strategies, even if they are best responses. 

Defining payoffs in games that last an infinite number of periods presents the problem 
that the total payoff is infinite for any positive payment per period. Ways to distinguish 
one infinite amount from another include 


(1) Use an overtaking criterion. Payoff stream π is preferred to π̃ if there is some
time T* such that for every T ≥ T*,

$\sum_{t=1}^{T} \delta^t \pi_t > \sum_{t=1}^{T} \delta^t \tilde{\pi}_t$.



(2) Specify that the discount rate is strictly positive, and use the present value. Since 
payments in far-off periods count for less, the discounted value is finite unless the
payments are growing faster than the discount rate. 


(3) Use the average payment per period, a tricky method since some sort of limit 
needs to be taken as the number of periods averaged goes to infinity. 
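
Methods (2) and (3) can be illustrated on truncated payoff streams. The sketch below assumes payments arrive at the end of each period, so that a constant payment x forever is worth x/r; the function names are hypothetical, and a long finite stream stands in for the infinite one.

```python
def present_value(stream, r):
    """Present value of payments made at the end of periods 1, 2, ..., at discount rate r > 0."""
    return sum(x / (1 + r) ** t for t, x in enumerate(stream, start=1))


def average_payoff(stream):
    """Average payment per period over a finite stretch of the stream."""
    return sum(stream) / len(stream)


# A constant payment of 5 per period: the undiscounted total grows without
# bound, but the discounted value approaches 5 / r.
long_stream = [5] * 200
pv_quarter = present_value(long_stream, 0.25)   # close to 5 / 0.25
pv_one = present_value(long_stream, 1.0)        # close to 5 / 1
avg = average_payoff(long_stream)
```

The average-payoff method gives the same value for any finite stretch of a constant stream, but for streams that vary the limit has to be handled with care, as the text notes.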


Whatever the approach, game theorists assume that the payoff function is additively 
separable over time, which means that the total payoff is based on the sum or aver- 
age, possibly discounted, of the one-shot payoffs. Macroeconomists worry about this 
assumption, which rules out, for example, a player whose payoff is very low if any of 
his one-shot payoffs dips below a certain subsistence level. The issue of separability will 
arise again in Section 12.5 when we discuss durable monopoly. 

Ending in finite time with probability one means that the limit of the probability the
game has ended by date t goes to one as t goes to infinity; the probability that the game
lasts till infinity is zero. Equivalently, the expectation of the end date is finite, which it
could not be were there a positive probability of an infinite length.

A realistic expansion of a game's strategy space may eliminate the Chainstore Paradox.
Hirshleifer & Rasmusen (forth), for example, show that allowing the players in a multi-person,
finitely repeated Prisoner's Dilemma to ostracize offenders can enforce cooperation even
if there are economies of scale in the number of players who cooperate and are not
ostracized.


N4.6 Infinitely Repeated Games and the Folk Theorem 


References on the Folk Theorem include Aumann (1981), Fudenberg & Maskin (1986), 
and Rasmusen (unpub). The most commonly cited version of the Folk Theorem says 
that if Conditions 1 to 3 are satisfied, then 


Any payoff combination that strictly Pareto-dominates the minimax payoff combinations in the mixed
extension of an n-person one-shot game with finite action sets is the average payoff in some perfect 
equilibrium of the infinitely repeated game. 

Trigger strategies are an important kind of strategy for repeated games. Consider 
the oligopolist facing uncertain demand (as in Stigler [1964]). He cannot tell whether 
the low demand he observes facing him is due to Nature or price cutting by his fellow 
oligopolists. Two things that could trigger him to cut his own price in retaliation are 
a series of periods with low demand or one period of especially low demand. Finding 
an optimal trigger strategy is a difficult problem (see Porter [1983a]). Trigger strategies 
are usually not subgame perfect unless the game is infinitely repeated, in which case 
they are a subset of the equilibrium strategies. 


Empirical work on trigger strategies includes Porter (1983b), who examines price wars
between railroads in the 19th century, and Slade (1987) who concluded that price wars
among gas stations in Vancouver used small punishments for small deviations rather 
than big punishments for big deviations. 

The reason why games with a constant probability of ending are like infinite games is 
nicely pointed out by a verse from Amazing Grace: 


When we've been there ten thousand years, 
Bright shining as the sun, 

We've no less days to sing God's praise 
Than when we'd first begun. 


A macroeconomist's technical note related to the similarity of constant-ending games 
and infinite games is Blanchard (1979), which discusses speculative bubbles. 

In the repeated Prisoner's Dilemma, if the end date is infinite with positive probability 
and only one player knows it, cooperation is possible for reasons explained in Section 
5.3. 

An equilibrium concept quite different from Nash or dominance is based on maximin 
strategies. 


The strategy $s_i^*$ is a maximin strategy for player i if, given that the other players pick strategies
to make his payoff as low as possible, $s_i^*$ gives him the highest possible payoff. In our notation, $s_i^*$
solves


$\underset{s_i}{\text{Maximize}} \; \underset{s_{-i}}{\text{Minimum}} \; \pi_i(s_i, s_{-i})$.   (4.9)


The maximin strategy need not be unique, and it can be in mixed strategies. Since 
maximin behavior can also be viewed as minimizing the maximum loss that might be 
suffered, decision theorists refer to such a policy as a minimax criterion, a catchier 
phrase (Luce & Raiffa [1957] p. 279). 

Maximin strategies have very little justification. They are not simply the optimal 
strategies for risk-averse players, because risk aversion is accounted for in the utility 
payoffs. The players’ implicit beliefs can be inconsistent in a maximin equilibrium, and 
a player must believe that his opponent would choose the most harmful strategy out of 
spite rather than self-interest. 

In the same vein, the minimax strategy has also been defined (the term is used 
differently by game theorists and decision theorists). 
The strategy $s_{-i}^*$ is a set of (n - 1) minimax strategies chosen by all the players except i to
keep i's payoff as low as possible, no matter how he responds. In our notation, $s_{-i}^*$ solves


$\underset{s_{-i}}{\text{Minimize}} \; \underset{s_i}{\text{Maximum}} \; \pi_i(s_i, s_{-i})$.   (4.10)


As an example, consider a two player game with player 1 as i.


Maximin:   $\underset{s_1}{\text{Maximize}} \; \underset{s_2}{\text{Minimum}} \; \pi_1$.

Minimax:   $\underset{s_2}{\text{Minimize}} \; \underset{s_1}{\text{Maximum}} \; \pi_1$.


In the Prisoner's Dilemma, the minimax and maximin strategies are both Fink. Al- 
though the Welfare Game (Table 3.1) has only a mixed strategy Nash equilibrium, if we 
restrict ourselves to the pure strategies the pauper's maximin strategy is Try to Work, 
which guarantees him at least 1, and his strategy for minimaxing the government is Be
Idle, which prevents the government from getting more than 0. 
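
The pure-strategy maximin and minimax calculations can be reproduced for the Prisoner's Dilemma. This sketch uses the two-sided payoffs of Table 4.3a and takes Row as the player whose payoff is at stake; the function names are hypothetical, and mixed strategies are not considered.

```python
# Two-sided Prisoner's Dilemma of Table 4.3a; payoffs are (Row, Column).
PD = {
    ("Cooperate", "Cooperate"): (5, 5),
    ("Cooperate", "Fink"): (-5, 10),
    ("Fink", "Cooperate"): (10, -5),
    ("Fink", "Fink"): (0, 0),
}
ACTIONS = ("Cooperate", "Fink")


def row_payoff(row, col):
    """Row's payoff from the action pair (row, col)."""
    return PD[(row, col)][0]


def maximin_row():
    """Row's action that maximizes his worst-case payoff, and that payoff."""
    best = max(ACTIONS, key=lambda r: min(row_payoff(r, c) for c in ACTIONS))
    return best, min(row_payoff(best, c) for c in ACTIONS)


def minimax_against_row():
    """Column's action that minimizes Row's best-reply payoff, and that payoff."""
    worst = min(ACTIONS, key=lambda c: max(row_payoff(r, c) for r in ACTIONS))
    return worst, max(row_payoff(r, worst) for r in ACTIONS)
```

Both calculations select Fink and both leave Row with a payoff of 0, in line with the statement that the minimax and maximin strategies coincide in the Prisoner's Dilemma.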

Under minimax, player 2 is purely malicious but must move first (at least in choos- 
ing a mixing probability) in his attempt to cause player 1 the maximum pain. Under 
maximin, player 1 moves first, in the belief that player 2 is out to get him. In non-zero- 
sum games, minimax is for sadists and maximin for paranoids. In zero-sum games, the 
players are merely neurotic: minimax is for optimists and maximin for pessimists. 

The Minimax Theorem (von Neumann [1928]) says that a minimax equilibrium exists 
in pure or mixed strategies for every two-person zero-sum game, and is identical to the 
maximin equilibrium. This theorem, while famous, is not useful in economics. 

An alternative to Condition 3 (dimensionality) in the Folk Theorem is 


Condition 3': The repeated game has a "desirable" subgame-perfect equilibrium in which the
strategy combination $\bar{s}$ played each period gives player i a payoff that exceeds his payoff from some
other "punishment" subgame-perfect equilibrium in which the strategy combination $s^*$ is played each
period:

$$\exists\, \bar{s} : \forall i,\ \exists\, s^* : \pi_i(s^*) < \pi_i(\bar{s}). \tag{4.11}$$

Condition 3' is useful because sometimes it is easy to find a few perfect equilibria. To 
enforce the desired pattern of behavior, use the "desirable" equilibrium as a carrot and 
the “punishment” equilibrium as a self-enforcing stick. (See Rasmusen [unpub].) 


Any Nash equilibrium of the one-shot game is also a perfect equilibrium of the finitely 
or infinitely repeated game. 


N4.7 Reputation: the One-Sided Prisoner’s Dilemma


A game that is repeated an infinite number of times without discounting is called a supergame. 


There is no connection between the terms "supergame" and “subgame.” 

The terms “one-sided” and “two-sided” Prisoner's Dilemma are new with this book.
Only the two-sided version is a true Prisoner's Dilemma according to the definition of 
note N1.2. 

Empirical work on reputation is scarce. One worthwhile effort is Jarrell & Peltzman 
(1985), which finds that product recalls inflict costs greatly in excess of the measurable 
direct costs of the operations. The sociologist Macaulay (1963) is much cited and little 
imitated. He notes that reputation seems to be more important than the written details 
of business contracts. 

Vengeance and Gratitude. Most models have excluded these feelings (although see 
J. Hirshleifer [1987]), which can be modelled in two ways: 


(1) A player's current utility from Fink or Cooperate depends on what the other
player has played in the past; or

(2) A player's current utility depends on current actions and the other players’ current
utility in a way that changes with past actions of the other player.


The two approaches are subtly different in interpretation. In (1), the joy of revenge is 
in the action of finking. In (2), the joy of revenge is in the discomfiture of the other 
player. Especially if the players have different payoff functions, these two approaches 
can lead to different results. 


N4.8 Product Quality in an Infinitely Repeated Game 


The product quality game may also be viewed as a principal-agent model of moral
hazard (see Chapter 6). The seller (the agent) takes the action of choosing quality, which
is unobserved by the buyer (the principal) but affects the principal's payoff,
an interpretation used in much of the Stiglitz (1987) survey of the links between quality
and price.

The intuition behind the Klein & Leffler model is similar to the explanation for high 
wages in the Shapiro & Stiglitz (1984) model of involuntary unemployment (Section 
7.3). Consumers, seeing a low price, realize that with a price that low the firm cannot 
resist lowering quality to make short-term profits. A large margin of profit is needed for 
the firm to decide to continue to produce high quality. 

A paper related to Klein & Leffler (1981) is Shapiro (1983), which reconciles a high 
price with free entry by requiring that firms price under cost during the early periods 
to build up a reputation. If consumers believe, for example, that any firm charging 
a high price for any of the first five periods has produced a low quality product, but 
any firm charging a high price thereafter has produced high quality, then firms behave 
accordingly and the beliefs are confirmed. That the beliefs are self-confirming does not 
make them irrational; it only means that many different beliefs are rational in the many 
different equilibria. 

An equilibrium exists in the product quality model only if the entry cost F is just the 
right size to make n an integer in equation (4.7). Any of the usual assumptions to 
get around the integer problem could be used: allowing potential sellers to randomize 
between entering and staying out; assuming that for historical reasons n firms have 
already entered; or assuming that firms lie on a continuum and the fixed cost is a 
uniform density across firms that have entered.


AN R&D RACE


The purpose of this case is to explore some strategic aspects of research and development (R&D) 
races between firms. 


R&D Intensity 


Two firms, A and B, have the basic know-how to develop a process of substantial commercial value, 
estimated to be approximately $10 million. The first firm to succeed in developing it will be able to 
obtain a patent and reap all the benefits from commercialization of the process. The development of the 
overall process will require three stages which must be completed sequentially. Thus, development of 
stage II cannot proceed until a firm has successfully completed development of stage I and development
of stage III cannot begin until stages I and II have both been completed. A patent can be granted only 
for the whole process and not for individual stages. 


A firm must decide on the intensity of its R&D effort, and an increase in the intensity of effort
shortens the likely time until the current stage is completed.


If a firm chooses a particular R&D intensity level (say 1), the probability that it will succeed before 
a given amount of time is given by an exponential distribution, as depicted in Figure 1. For instance, 
the probability that the stage it is currently working on will be completed in less than a year is about 
0.65. 





This case was prepared by Professors Vijay Krishna and Adam Brandenburger as the basis for class discussion
rather than to illustrate either effective or ineffective handling of an administrative situation.


Copyright © 1990 by the President and Fellows of Harvard College. To order copies, call (617) 495-6117 or write the
Publishing Division, Harvard Business School, Boston, MA 02163. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted in any form or by any means—electronic, mechanical, photocopying, recording, or
otherwise—without the prior permission of the Harvard Business School.


[Figure 1: Probability of completing the current stage within a given time, at R&D intensity 1.]


The probability of success at any time does not depend on the cumulative effort expended to date 
on the project but only on the current intensity of effort. 


By choosing a higher intensity level, say 2, the firm is more likely to succeed earlier. Figure 2
depicts the likelihoods of success with R&D intensities of 1 and 2 respectively. Notice that with a
doubling of R&D intensity from 1 to 2, the chance that the firm will succeed in less than a year has risen
to almost 0.9.
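
The two probabilities quoted so far can be reproduced with a short calculation. This is a sketch that assumes the intensity level is the rate parameter of the exponential distribution, measured per year; the case does not state the parameterization explicitly, and the function name is hypothetical.

```python
import math


def prob_success_within(years, intensity):
    """Probability that the current stage is completed within `years`.

    Assumption (not stated in the case): time to success is exponential
    with a rate per year equal to the R&D intensity.
    """
    return 1.0 - math.exp(-intensity * years)


print(round(prob_success_within(1.0, 1.0), 3))  # 0.632
print(round(prob_success_within(1.0, 2.0), 3))  # 0.865
```

Under this assumption the one-year probabilities come out near 0.63 and 0.86, roughly in line with the "about 0.65" and "almost 0.9" read off the figures.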


[Figure 2: Probability of completing the current stage within a given time, at R&D intensities 1 and 2.]








¹This fact is a consequence of the assumption that the distribution of time to success is exponential.


On the other hand, a more intense R&D effort also leads to greater R&D costs. If firm A were
to choose an intensity level of a and maintain the effort for a year, the associated costs would be $a²
million.


The Race 


If firm A decides to conduct its R&D at a level a and firm B decides on a level b, the probability
that A succeeds in completing the current stage before B is a/(a+b).² As an example, suppose that firm
A has yet to complete stage I and is working on the problem with an intensity level of 3, whereas firm 
B has already completed stage I and is working on stage II with an intensity of 2. The probability that 
firm A will complete stage I before firm B can complete stage II is three-fifths (= 3/(3+2)), and the 
probability that firm B will finish stage II before firm A can complete stage I is two-fifths. 


The expected time before one or the other firm has a success, that is, the duration of the current
race, is 1/(a+b) years.² In the example given above, in one-fifth of a year, it is expected that one firm
or the other will make progress: either firm A will complete stage I or firm B will complete stage II.
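
The two race formulas can be evaluated directly for the example above. This is a minimal sketch; the function names are hypothetical and the code simply restates a/(a+b) and 1/(a+b).

```python
def prob_a_first(a, b):
    """Probability that firm A completes its current stage before firm B."""
    return a / (a + b)


def expected_race_duration(a, b):
    """Expected time, in years, until one of the two firms has a success."""
    return 1.0 / (a + b)


# Firm A works at intensity 3, firm B at intensity 2.
print(prob_a_first(3, 2))            # 0.6
print(expected_race_duration(3, 2))  # 0.2
```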


News concerning R&D successes travels quickly in industry circles. If either firm were to complete 
a particular stage, news of this development would become known to its competitor. 


The Final Stage 


To understand further how the race is played out, consider the situation in which both firms have 
succeeded in completing stages I and II and are now in the last leg of the race to complete stage III. 
Suppose the two firms decide to carry out R&D at intensities a and b respectively. The expected value 
(in $ million) to firm A can be written as: 


Value of completing the current stage × Probability of winning − Expected cost of R&D effort

    = 10 × [a/(a+b)] − a² × [1/(a+b)].

The first part of the expression is clear. The second comes from the fact that R&D carried out at
an intensity of a would result in annual costs of $a² million. Since the expected duration of the race is
1/(a+b) years, the expected cost is found by multiplying the two.

A similar expression for firm B gives its expected value. 
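
As a hedged illustration, the expected-value expression can be written as one function that serves both firms by swapping the arguments. The function name and the trial intensities of 3 and 2 are assumptions made for this example, not values taken from the case exhibits.

```python
def expected_value(own, rival, prize=10.0):
    """Expected value ($ million) of the stage III race to one firm.

    own, rival: R&D intensities of this firm and of its competitor.
    prize: value of winning the race, $10 million in the case.
    """
    win_probability = own / (own + rival)
    expected_duration = 1.0 / (own + rival)  # years
    annual_cost = own ** 2                   # $ million per year
    return prize * win_probability - annual_cost * expected_duration


value_a = expected_value(3.0, 2.0)  # firm A at intensity 3 against B at 2
value_b = expected_value(2.0, 3.0)  # firm B: same formula, roles swapped
print(round(value_a, 2), round(value_b, 2))  # 4.2 3.2
```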

Each firm finds its optimal R&D intensity level by maximizing its expected value. The two reaction
functions are given in Figure 3 and they show that if each firm chose an intensity of 3.4, the other would
choose the same intensity. The associated annual cost to each firm would be $3.4² = $11.6 million
but the expected duration of the race would be only 1/6.8 years or somewhat less than 2 months. The
expected value of being in the race at stage III is therefore $3.3 million (10 × (1/2) − 11.6/6.8).


²This fact is again a consequence of the exponential distribution.


[Figure 3: Reaction functions of the two firms. Horizontal axis: R&D intensity, Firm A; vertical axis: R&D intensity, Firm B.]
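
The reaction functions can also be sketched numerically. The closed form below is derived here from the stage III expected-value expression, not taken from the case: setting the derivative of (10a − a²)/(a+b) with respect to a to zero gives a² + 2ab − 10b = 0, so a = −b + √(b² + 10b). Function names are hypothetical.

```python
import math


def expected_value(own, rival, prize=10.0):
    """Stage III expected value ($ million) from the case's expression."""
    return (prize * own - own ** 2) / (own + rival)


def best_response(rival, prize=10.0):
    """Intensity maximizing expected_value against a rival intensity.

    Derived from the first-order condition own^2 + 2*own*rival - prize*rival = 0.
    """
    return -rival + math.sqrt(rival ** 2 + prize * rival)


def symmetric_equilibrium(start=1.0, rounds=60):
    """Iterate both firms' best responses until they settle."""
    a = b = start
    for _ in range(rounds):
        a, b = best_response(b), best_response(a)
    return a, b


a_star, b_star = symmetric_equilibrium()
print(round(a_star, 2), round(b_star, 2))        # 3.33 3.33
print(round(expected_value(a_star, b_star), 2))  # 3.33
```

Under this derivation the intersection is at 10/3, about 3.33, close to the 3.4 quoted in the case, which appears to be a rounded reading from Figure 3.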


This case is based on the paper "Racing with Uncertainty," by C. Harris and J. Vickers, The Review of Economic
Studies, 1987, 54, 1-21.


[Exhibit: Value if win and value if lose, with the associated R&D intensity. All values are in $ million.]

[Table: R&D Intensities, by the stage Firm A is working on and the stage Firm B is working on.]

[Table 2: Expected Values ($ million), by the stage Firm A is working on and the stage Firm B is working on.]


THE RACE TO DEVELOP HUMAN INSULIN


In May 1976 Eli Lilly & Co., a leading U.S. pharmaceutical house and the dominant force in the
U.S. insulin market, invited experts in recombinant DNA technology and specialists in diabetes to
participate in a symposium in Indianapolis. The purpose of the meeting was to share ideas concerning 
the possibility of using recombinant DNA technology to manufacture synthetic human insulin. The 
conference was to serve as the starting gun for a major scientific race. 


Diabetes 


Insulin is a hormone which is manufactured in the pancreas and which regulates the blood sugar
level in the body. Diabetes is a disease which results from an inability to produce sufficient insulin; it 
is, in most cases, an inherited disease. Insulin had been discovered by Frederick Banting and Charles 
Best of the University of Toronto in 1921-22. Banting and Best had gone on to use insulin extracted from 
the pancreases of cows and pigs to treat diabetes patients. 


In the mid 1970s approximately ten million Americans were estimated to be diabetics. The disease 
was the seventh leading cause of death in the U.S., killing over thirty-five thousand people per year. 
Between three and five percent of the world's total population suffered from diabetes. Diabetics were 
classified as either type I or type II. Type I diabetics required daily injections of insulin. For type II
diabetics, the disease could be controlled through a combination of physical exercise and a strict diet.
In the U.S. approximately fifteen percent of the diabetics were classified as type I. 


The amino acid composition of bovine and porcine insulin was very close to that of human insulin 
and most diabetics tolerated animal insulin well. Approximately ten percent of the patients taking insulin 
medication developed allergic reactions to bovine insulin; fewer patients experienced negative reactions 
to porcine insulin since the latter was closer in composition to human insulin than was bovine insulin. 


This case was prepared by Research Associate Paul Barese under the supervision of Professors Adam Brandenburger
and Vijay Krishna as the basis for class discussion rather than to illustrate either effective or ineffective handling 
of an administrative situation. 


Copyright © 1991 by the President and Fellows of Harvard College. To order copies, call (617) 495-6117 or write the 
Publishing Division, Harvard Business School, Boston, MA 02163. No part of this publication may be reproduced, stored in 
a retrieval system, or transmitted in any form or by any means—electronic, mechanical, photocopying, recording, or 
otherwise—without the prior permission of the Harvard Business School.


Sales of insulin in the U.S. totalled approximately $200 million annually. Eli Lilly held between
eighty and eighty-five per cent of the market. 


At the Indianapolis meeting, officials from Eli Lilly expressed concern about the supply of animal 
pancreases on the world market. They foresaw increasing worldwide demand for insulin combined with 
decreasing supplies of animal pancreases, leading inexorably to an overall shortage of insulin. These 
projections, together with the problem of allergic reactions to animal insulin in some patients, were the 
factors behind Eli Lilly’s decision to bring the experts together. 


Recombinant DNA Technology 


Recent progress in biology had centered on recombinant DNA technology, popularly referred to as 
"gene-splicing." Advances had been made which indicated that genetic material could be manipulated 
to produce new, medically useful substances. 


The unofficial birth date of recombinant DNA technology is commonly accepted as November 1973 
when the results of the now famous Cohen-Boyer experiment were published in the Proceedings of the
National Academy of Sciences. Stanley Cohen of Stanford University and Herbert Boyer of the 
University of California-San Francisco (UCSF) had teamed up to show that it was possible to use so- 
called restriction enzymes to "cut and paste" pieces of DNA molecules from different cells to create a 
desired "original." The original of the recombined DNA could then be smuggled into bacteria where the 
normal replication processes of bacteria could be used as a photocopying machine to obtain large amounts 
of the desired DNA. 


A remaining problem with the technology was that there was no means by which to read the new 
genetic "document" to verify that the cutting and pasting had indeed produced the desired result. This 
obstacle was overcome in 1975 at one of the laboratories at Harvard University, where a technique was 
developed to read any segment of DNA. Named after two Harvard scientists, the Maxam-Gilbert 
technique served as a "lens" that allowed genetic text to be read accurately and quickly. In terms of 
speed, it represented the difference between copying a book by long hand and making a photocopy of it. 


By 1976 it was possible to cut specific segments of genetic material, paste them to other pieces of 
DNA, smuggle the recombined DNA into bacteria, make copies, and read the results. Because of the 
risks perceived in performing experiments involving DNA, in particular the possibility of inadvertent 
release of dangerous substances, researchers in recombinant DNA technology abided by guidelines set 
out by the National Institutes of Health (NIH); prior approval had to be obtained for any new procedures. 


The developments in recombinant DNA technology led to awakening interest on the part of a 
number of research groups in the possibility of making genetically accurate human insulin. Experts in 
the field understood that such an endeavor would require successful completion of four stages of research. 
The stages were isolation, conversion, cloning, and expression, respectively. The first stage involved 
finding and purifying source material for some version of the insulin gene. The second stage, conversion, 
consisted of turning the isolated material into functional DNA. The third stage, cloning, required slipping 
the insulin gene into bacteria and getting it copied correctly. The fourth and final stage, expression,
involved "turning on" the bacterial cells so that they would be tricked into making insulin. Some experts
thought all this could be done in two years; others predicted ten years. In due course, three major and 
separate research efforts were launched. 


The Competitors 


Genentech Herbert Boyer had worked in the field of bacterial genetics during the late 1960s and 
early 1970s. At that time, his research had been considered out of the mainstream and funding had been 
limited. However, Boyer’s work with enzymes, used to cut and paste segments of DNA, had brought 
him to the forefront of the new field of genetic engineering. His UCSF laboratory quickly became a 
leader in recombinant DNA technology. 


Boyer, whose ambition in high school had been to become a successful businessman, immediately 
recognized the commercial potential of genetic engineering. But although he approached several 
companies about exploiting the technology, Boyer could not find any backers.


During this same period, Robert Swanson, who worked at the venture capital firm of Kleiner &
Perkins in San Francisco, was trying to start a biotechnology company. Swanson, who had a background 
in biochemistry and a degree in management from the Massachusetts Institute of Technology, believed 
that recombinant DNA technology held out great industrial promise but, to date, he had been unable to 
find the necessary scientific support. All the molecular biologists he had approached so far were of the 
view that pure research could not be conducted in a business environment. In January 1976, Swanson 
and Boyer met. 


Boyer and Swanson each contributed $500 as initial operating capital for a new company. While 
Swanson left his job to dedicate himself full-time to their new endeavor, Boyer contacted a colleague, 
Arthur Riggs, at the City of Hope National Medical Center outside Los Angeles. Boyer explained that 
a friend of his might be able to raise money for a project to create human insulin. Riggs and a colleague, 
Keiichi Itakura, were, in fact, planning a different project. They were looking into the possibility of 
synthesizing the gene for somatostatin, a human brain hormone much simpler in composition than insulin. 
Riggs argued that the somatostatin project would pioneer techniques of value to the insulin project. Riggs 
and Itakura both agreed to participate in the venture being put together by Boyer and Swanson. 


On April 7, 1976, Boyer and Swanson incorporated. The company, called Genentech, was launched 
with $100,000 from Thomas Perkins, head of Kleiner & Perkins. With Riggs and Itakura as informal
co-founders, the first project for Genentech was the somatostatin gene. The Genentech endeavor would 
involve contracts with Boyer's laboratory at UCSF and with Riggs and Itakura at the City of Hope 
National Medical Center. 


University of California-San Francisco Prior to the formation of Genentech, Boyer had 
collaborated with another UCSF scientist, Howard Goodman, on many projects. Like Boyer's laboratory, 
Goodman's lab was also in the forefront of the recombinant DNA field. In early 1976, Goodman teamed 
up with William Rutter, the chairman of the UCSF biochemistry department and a veteran in the field
of pancreatic research, in a joint quest for the insulin gene. It was Rutter who had been responsible for
transforming the UCSF department into a major force in recombinant DNA technology. The Rutter-Goodman
group became the second competitor in the race to make synthetic human insulin.


Harvard The third competitor in the race was a group at the Harvard Biological Laboratories 
headed by Walter (Wally) Gilbert. Gilbert had originally been trained as a physicist. In 1954, after 
graduating from Harvard, he had gone to Cambridge, England, to work on his Ph.D. in theoretical 
physics. While there he had met James Watson, the co-discoverer, with Francis Crick, of the double-helix
structure of DNA. After returning to Harvard, Gilbert had spent more and more time in Watson’s lab
until finally turning to biology full-time. Gilbert was regarded as one of the most enterprising and 
incisive biologists of his generation. In 1975, Gilbert, together with Allan Maxam, a member of his 
group, had worked out the crucial gene identification technique that bore their names. Now Gilbert was 
interested in applying recombinant DNA technology to develop synthetic human insulin. 


The teams at the Indianapolis meeting knew that isolation and conversion, the first two stages in the 
development of synthetic human insulin, were at least technically feasible. To the most optimistic, 
cloning and expression were also "doable." The race was on. 


Preliminary Approaches 


The three teams adopted one or other of two fundamentally different approaches in their initial 
research. The UCSF and Harvard teams planned first to test out their ideas by trying to develop rat 
insulin. The idea was that once appropriate procedures had been developed, they would be applied to 
the creation of human insulin. On the other hand, Genentech chose, as part of the strategy worked out 
with the City of Hope scientists, to begin by synthesizing somatostatin using off-the-shelf chemical 
supplies. The creation of somatostatin would then serve as a model for making human insulin. 


Stage 1: Isolation 


The first task for the UCSF and Harvard teams was to obtain the raw material for isolating the rat 
insulin gene. The best source seemed to be rat insulinoma (tumor tissue) from a lab at the Joslin 
Research Laboratories at Brigham Hospital in Boston, where research on pancreatic tumors in rats was 
being performed. Although both teams had been promised insulinoma from the lab, when Goodman 
travelled to Boston to pick some up, he was told that there was not enough material to spare. Shortly 
after, some members of the Harvard team took a taxi over from Cambridge to the Brigham Hospital and 
returned with insulinoma tissue. The news shocked the UCSF team, which felt that it had been placed 
at a significant disadvantage at the very outset of the race. The UCSF team decided to plunge ahead and 
use a brute-force method for isolation—the surgical removal of the pancreases of over two hundred rats. 
The approach involved a great deal of time and money. 


Meanwhile, the Harvard team had encountered an unanticipated obstacle. The regulation of 
recombinant DNA research had become a political issue in the City of Cambridge, where the Harvard 
labs were based. The National Institutes of Health operated a series of guidelines concerning recombinant 
DNA experiments. Laboratories were classified as P1, P2, P3, or P4 facilities, depending on the
physical-safety features in place to prevent escape of disease-causing organisms. To proceed with the
cloning of the rat insulin gene, a P3 level lab was required. (A P4 lab was the equivalent of a military 
germ-warfare research facility.) Harvard did not possess a suitable facility and its plans to build one had 
run into opposition from the local community. In July 1976, the Cambridge City Council voted to call 
for a voluntary three-month moratorium on recombinant DNA experiments in Cambridge; the moratorium 
ultimately lasted until February 1977. 


While the Harvard team was held up, the UCSF team pressed ahead with its brute force approach 
and managed to isolate some raw material—a few hundred microliters of packed cells in the form of a 
pellet. The Harvard lab had lost its initial advantage. By the Fall of 1976, the UCSF team had moved 
on to the second stage of the race. 


Stage 2: Conversion 


Rumors circulated on both coasts throughout the race. Despite Harvard's setback, the UCSF team 
continued to hear frequent reports of progress there. No one knew whether the rumors were true or not, 
but for all concerned they served to elicit a Pavlovian response to believe the worst and work even 
harder. By January 1977 the UCSF team had successfully completed the second stage and was ready to 
insert the insulin gene into bacteria. 


In the spring of 1977, Itakura at City of Hope succeeded in synthesizing the somatostatin gene, 
which was then sent up to Boyer at UCSF for analysis. This step was equivalent to completion of the 
conversion stage. 


Stage 3: Cloning 


In May 1977, the UCSF team announced that it had succeeded in cloning rat insulin. Indeed, a 
paper to this effect had already been submitted to the journal Science. At this point the UCSF team had 
completed the isolation, conversion, and cloning stages, albeit for rat insulin. At the news conference, 
Rutter predicted that rat insulin could be expressed within six months and that expression of human 
insulin was only a year or two away. 


Meanwhile, at Genentech, Boyer analyzed the somatostatin gene that had been sent to him by Itakura
and succeeded in cloning it.


Stage 4: Expression 


In June 1977, shortly after the UCSF cloning results had been made public, Genentech attempted 
its first expression experiment. It was a failure. By August the experiment had been repeated, this time 
successfully. A news conference was held in Los Angeles on December 1, 1977, to announce that a 
human protein had, for the first time ever, been successfully expressed in bacteria. Stories appeared in 
Business Week and on the financial pages of The New York Times about Genentech, thereby guaranteeing
the financing that the company needed for its next project, human insulin. Privately, Swanson had been
told that human insulin could be made within six months. Swanson knew that being first to make human 
insulin would establish Genentech’s preeminence in the emerging field of biotechnology. 


Despite early difficulties caused by the absence of a suitable facility, the Harvard team could not
be counted out of the race. Gilbert had gained access to MIT’s P3 laboratory and in February 1978 the 
Harvard team succeeded in cloning and expressing the rat insulin gene in a single, cleverly conceived 
step. By May 1978 the Harvard results had been confirmed. Around this same time, Gilbert had become 
involved with several venture capitalists and other leading molecular biologists from the United States 
and Europe in setting up a biotechnology company, Biogen, in Geneva, Switzerland. The commercial 
possibilities opened up by the successful expression of rat insulin at Harvard were avidly discussed. 


The approach adopted by the UCSF and Harvard teams left them with a major problem. The NIH 
guidelines required that human genetic material be handled in a P4 facility, the level at which germ-warfare
research was conducted. No P4 facilities in the U.S. were available to university researchers.
Since the NIH guidelines said nothing about synthetic DNA, Genentech faced no such logistical 
constraints. 


After searching around the world for a suitable facility, the Harvard team was eventually granted 
access to P4 containment facilities at the British Army’s top-secret Microbiological Research Establish- 
ment in Porton Down, England, for four weeks beginning in September 1978. Biogen paid to send the 
Harvard team over to England, where team members found procedures in the British military facility to 
be much more rigid and formal than those to which they were accustomed. In particular, team members 
had to conduct some of their experiments while wearing gas masks. 


Eli Lilly came to the rescue of the UCSF team on the P4 laboratory issue. The company built a 
P3 laboratory at a Lilly subsidiary in Strasbourg, France. (A P3 facility sufficed since French regulations 
were less stringent than those in the United States.) In September, a member of the UCSF team left for 
France to perform cloning experiments. 


At Genentech, experiments were in progress twenty-four hours a day. In the early morning hours 
of August 24 the Genentech scientists performed a final experiment and checked their results. They had 
produced synthetic human insulin. Genentech scheduled a news conference for September 6.


In England, the Harvard team thought that it had won the race. The night of September 7, hours 
before it heard the news about Genentech, the team was celebrating. Verification tests, however, showed 
that the team’s experiments had been contaminated. This was about the time that the news from 
California reached England. The Harvard team tried to recover and repeat its experiments but the four 
weeks in England ended with little in the way of positive results. 


The UCSF team member in France received the news of Genentech’s triumph from Eli Lilly. While 
in France he succeeded in isolating and cloning the human insulin gene.


Epilogue


From the time of the 1976 conference, Eli Lilly had maintained contact with all three teams in order 
to keep itself abreast of the race. Lilly officials were also wary of developments at Novo Industri, a 
Danish company which was the dominant force in the European insulin market. Novo had recently 
perfected a method for purifying porcine insulin that resulted in a virtually nonallergenic product. Novo 
was also rushing to develop human insulin. In early 1978, Novo officials had travelled to San Francisco 
to hold discussions with Genentech. Although the Novo people had not been convinced of the feasibility 
of the recombinant DNA method, the company had gone on to devise a method to replace the single amino
acid in porcine insulin that differed from human insulin with the human amino acid. Eli Lilly needed
a way to compete with Novo's formidable product line. 


Genentech signed an agreement with Eli Lilly on August 25, 1978, one day after the confirming 
experiment. Eventually, Lilly assumed responsibility for the industrial production of human insulin. 
Genentech went on to pursue the synthesis of other human proteins for commercial application. On 
October 14, 1980, some 1.1 million shares of Genentech, at a price of $35 per share, were made available
to the public. Within twenty minutes of trading, the price had risen to $89; by the end of the day
it had settled down to $71¼. Novo Industri was the first pharmaceutical house to reach the European
market with human insulin, in 1982. Lilly reached the American market first, in 1983. 


During congressional hearings in November 1977, Ronald Cape, chairman of Cetus Corporation,
a biotechnology startup, had predicted that insulin production by recombinant DNA technology was at
least ten years away. In fact, it had taken just six years.


Reference: 


Invisible Frontiers: The Race to Synthesize a Human Gene, by Stephen S. Hall. New York, Atlantic 
Monthly Press, 1987.


OPIM 102

PROBLEM FINDING AND ALTERNATIVE GENERATION

1/14/97

PROBLEM FINDING

• Descriptive Features
• Problem Discovery
    Reactive/Proactive
    Perception limitations --- Chunking
• Importance of Goals, Values and Needs
• Problem Acceptance
    Historical models
    Communicated models
    Planning models
    Type I, Type II and Type III errors
• Illustrative Example (Apartment Selection)
• Challenges in Finding the Right Problem
    GM Locating a New Facility in Brazil
    Planning Family Reunion
    Allocating Time Between Courses
• Prescriptive Aspects of Problem Finding
    Clarifying Goals, Values and Needs
    Bounded Rationality --- Framing
    Managing Attention --- Limitations in Information Processing
• Steps for Making Better Decisions
• Creative Problem Solving Problems

KKS Chap 2


Problem Context
    Social Context
    Institutional Constraints
    Available Information

Problem Finding
    Problem Identification
    Problem Acceptance
    Problem Representation

Social Level     Problem Context            Problem Finding
Individual       Finding living quarters    Framing the problem properly
Two-Person       Doctor-Patient             Communicating and understanding the patient's illness
Group            Family                     Finding a site for the family reunion
Organization     Improving quality          Diagnostic: What's the real problem
Social           Facility siting            Resolving conflicts among competing interests

Figure 2.1. Typical examples of problem-finding activity 
SELF-ACTUALIZATION NEEDS
Achieving One's Full Potential

ESTEEM NEEDS
Self-Esteem, Self-Confidence
Respect of Others

SOCIAL NEEDS
Belonging, Friendship, Love

SECURITY AND STABILITY NEEDS
Assure Physiological Needs for the Future

PHYSIOLOGICAL NEEDS
Hunger, Thirst, Sleep
Protection from the Elements

Figure 2.5. Maslow's hierarchy of needs (Based on "Hierarchy of Needs" 
from Motivation and Personality by Abraham Maslow. Copyright 1954, 
by Harper & Row, Publishers, Inc. Copyright © 1970 by Abraham H. 
Maslow. Used by permission of HarperCollins Publishers) 







[Flowchart labels: Data from Environment as Perceived by Decision Maker (DM); Model of Desired Situation; Model of Reality; Scanning Process: Is there a Significant Difference?; Interrupt Process: Does DM want to Improve upon Difference?; Acceptance Process: Is Problem Worth Considering Further?; YES leads to ACCEPT PROBLEM; NO leads to CONTINUE SCANNING]

Figure 2.6. The problem-finding process 


LOCATING NEW PRODUCTION FACILITY IN BRAZIL 


PROBLEM POSED BY GM 
• LOCATING A NEW FACILITY IN BRAZIL

• Q: HOW DO WE FIND THE BEST SITE TO MAKE AUTOS
  GIVEN GROWTH POTENTIAL IN BRAZIL and SA?


FACTORS PERCEIVED TO BE IMPORTANT

• LOW COST OF RAW MATERIALS and ACCESS TO OCEAN TRANSPORT

• INEXPENSIVE BUT QUALIFIED LABOR SUPPLY

• LOW TRANSPORTATION COST TO SOURCES OF DEMAND


GM DECISION: LOCATE PLANT ALONG COAST NEAR RIO

FACTORS CONSIDERED TO BE IMPORTANT BY GOVERNMENT

• BALANCE REGIONAL DEVELOPMENT AND GROWTH

• POLITICAL FACTORS THAT ARE RELEVANT

• JOBS FOR COMMUNITY

IMPACT ON GM:

• HAD TO CHANGE ITS INITIAL DECISION, WITH REDESIGN COSTS

• LOST A LOT IN "PR POINTS" BY D.A.D. STRATEGY
  (D.A.D. = Decide, Announce, Defend)


[Diagram labels: Visions of the World; Analysis to Gain Understanding; Connection; Unified Vision; Visions of the World]

Figure 2: The unifying vision process accommodates the multiple frames of multiple decision 
makers. 


Figure 1.3 [Family tree]

    Mr + Mrs Applebaum (New York City)
    Steven + Patricia Applebaum (Washington, D.C.)
    Joyce + Marvin Spiller (Chicago)
    Ronald + Phyllis Applebaum (Toronto)
    Douglas, Age 1


PLANNING FAMILY REUNION 


PROBLEM POSED BY ORGANIZER (RON APPLEBAUM) 


• FIND LOCATION WHICH MINIMIZES DISTANCE TRAVELED

• NO PRIORITY GIVEN TO ANY FAMILY MEMBER(S)


RELEVANT CONCERNS

• PREFERENCES OF DIFFERENT FAMILY MEMBERS

• TRAVEL LIMITATIONS (E.G. RON'S PARENTS)


LESSONS IN PROBLEM FORMULATION

• IMPORTANCE OF DEVELOPING A GOOD PROCESS

• CHOICE IS OFTEN A FUNCTION OF DIALOG AMONG GROUP
  MEMBERS


ALLOCATION OF TIME BETWEEN COURSES

OBJECTIVE: HOW DO I DO WELL IN OPIM 102?
SUBOBJECTIVES:

• HOW DO I DO WELL ON HOMEWORK?
• HOW DO I DO WELL ON EXAMS?
• HOW DO I DO WELL IN CLASS PARTICIPATION?


FORMULATE ALTERNATIVE STRATEGIES
• DEVOTE X HOURS BEFORE EVERY CLASS TO
  READING AND HOMEWORK
• LAST MINUTE PREP BEFORE EXAMS
• READ MATERIAL OVER LUNCH


CONSTRUCT ALTERNATIVE/ATTRIBUTE MATRIX 












[Diagram labels. Individual Characteristics: Biological, Cultural, Social, Education, Experience. Problem Context: Social Norms Evoked, Interactions Evoked, Organizational and Social Filters, Available Information. Also shown: Models and Theories Defining the Achievable; Scanning Processes; Information on Status Quo; Individual Values and Needs; Cognitive Resources; Emotional Resources]

Figure 2.7. Descriptive and prescriptive elements of problem finding 


FOUR STEPS FOR MAKING BETTER DECISIONS 


by Paul Schoemaker, Ph.D. 


Step I. FRAME THE PROBLEM RIGHT 
TRAP: Plunging in and starting to solve the wrong problem 


CURE: Back up and look at the issue from multiple angles 


Step II. REVIEW & IMPROVE INFORMATION 
TRAP: Being too sure about the quality of your data 


CURE: Be explicit about what you do and don't know 


Step III. MAKE YOUR CHOICE CAREFULLY 
TRAP: Shooting from the hip when facing complex options 


CURE: Weigh pros and cons in a systematic way 
Step IV. TURN FEEDBACK INTO LEARNING 


TRAP: Rationalizing your choices or denying mistakes 


CURE: Let the facts speak (by analyzing all feedback) 


Copyright 1996. Decision Strategies International, Inc. 


I. Better Problem Framing 


A. Keep Your Eye on Fundamental Goals 


- Distinguish ‘means’ from ‘end objectives’ 
- Challenge your (implicit) reference points 


- Find out how others view the same problem 


B. Broaden The Range of Options 


- Perform an assumptional analysis 
- Construct wide ranging scenarios 


- Assume away self-imposed constraints 


C. Use Creativity Techniques 


- Engage a small group in brainstorming 
- Encourage contrarian thinking 


- Use analogies (such as metaphors) 


Copyright 1996. Decision Strategies International, Inc. 


IV. Obtaining Better Feedback 


A. Overcome Your Internal Blocks & Defenses 


Keep notes of your assumptions and reasoning process 


Seek the frank views of others to avoid rationalization 


View any failure or success as a learning opportunity 


B. Analyze Existing Data For Hidden Patterns 


Identify areas where you might have access to better data 


Structure these data sets to extract their key lessons 


Use statistical techniques to control for random noise 


C. Develop Better Data for Continual Learning 


Track key predictions and results using periodic reviews 


Identify which data sets may be confusing or confounded 


Design experiments to get clean tests of key hypotheses 


Copyright 1996. Decision Strategies International, Inc. 
OPIM 102 SPRING 1998 


Creative Problem-solving 
(For class discussion on January 19) 


MISSIONARY/CANNIBAL PROBLEM 


Three missionaries and three cannibals on one side of a river want to get to the other side 
by boat. The boat holds no more than two people. If there are more cannibals than 
missionaries at any time on land, the cannibals will eat the missionaries. 


Show how you can transport all cannibals and missionaries to the other side without losing 
some of the missionaries in the process. 
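
One way to explore this puzzle after trying it by hand is a breadth-first search over bank configurations. The sketch below is only an illustration, not part of the class handout: it assumes the usual reading that a bank is unsafe when it holds at least one missionary and more cannibals than missionaries, and it ignores who is sitting in the boat. The names `is_safe` and `solve` are hypothetical.

```python
from collections import deque

TOTAL = 3  # three missionaries and three cannibals
# (missionaries, cannibals) that can ride in a boat holding at most two people
BOAT_LOADS = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]


def is_safe(m, c):
    """A bank is safe if it has no missionaries or at least as many missionaries as cannibals."""
    return m == 0 or m >= c


def solve():
    """Breadth-first search; a state is (missionaries on start bank, cannibals on start bank, boat on start bank)."""
    start, goal = (TOTAL, TOTAL, True), (0, 0, False)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        m, c, boat_here = state
        sign = -1 if boat_here else 1  # people leave the start bank or return to it
        for dm, dc in BOAT_LOADS:
            nm, nc = m + sign * dm, c + sign * dc
            if not (0 <= nm <= TOTAL and 0 <= nc <= TOTAL):
                continue
            if is_safe(nm, nc) and is_safe(TOTAL - nm, TOTAL - nc):
                nxt = (nm, nc, not boat_here)
                if nxt not in parent:
                    parent[nxt] = state
                    queue.append(nxt)
    return None


path = solve()
for m, c, boat_here in path:
    print(f"start bank: {m}M {c}C, boat on {'start' if boat_here else 'far'} bank")
print("crossings:", len(path) - 1)
```

Because breadth-first search expands states in order of the number of crossings, the first path it finds uses the fewest crossings under these assumptions.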


COVER THE SQUARES You can cover a checkerboard with 32 dominoes each one 
covering two squares. Suppose you remove the bottom left square and the top right square. 
Can you cover the board with 31 dominoes? 


BIRD-TRAIN PROBLEM Two trains are initially 100 miles apart and are approaching 
each other at 50 miles per hour. A bird flies back and forth at 100 mph between the two 
oncoming trains. How far does the bird fly before the two trains pass each other? 


Lecture Notes: OPIM 102 
Dr. Kleindorfer 
January 19, 1998 


Today: Problem Solving, Prediction & Inference 


Motivation: MAX over x in X of E_θ{U(A_1(x, θ), ..., A_n(x, θ))}

    X = Set of Alternatives
    θ = State of the world whose exact value is not known at
        the time the decision of which alternative to choose
        is made
    A_1, ..., A_n = set of attributes of interest in the decision problem
        (whose values depend on the alternative chosen
        and on the state of the world)
    E_θ = Expected Value over the states of the world

Key Questions:

• Probabilities or Likelihoods of various states of the world
• Predicting the consequences A_i(x, θ)


Individual decision making 


FORMULATION OF OBJECTIVES             } Chapter 2
IDENTIFICATION OF ALTERNATIVES        }
PREDICTION AND INFERENCE (JUDGMENT)     Chapter 3
EVALUATION                            } Chapter 4
CHOICE                                }
LEGITIMATION AND IMPLEMENTATION         Chapter 5


Figure 3.1. Phases of problem-solving process 
Prediction and inference

[Diagram labels: PREDICTORS OR CUES; Variable of Interest: Y_e; Subject's Response: Y_s; Predicted Level: Ŷ_e; Predicted Response: Ŷ_s]

Figure 3.3. Diagram of the Lens model.

Note: Ŷ_e and Ŷ_s are prediction models. 
Table 3.3. Summary of major Lens model studies 
Type of prediction task R, > К: R, Ra Ca¥ № М, 


Future return on common stock ..01 .78 92 .01 04 21 18 
Faculty evaluations of graduate 49 .54 78: 25 -0i | 111 
students 


Life-expectancy of cancer “О 35 дї. 48  -—.06 3 186 
patients 
Changes in stock prices (23 .80 85 29 -02 s 35 


Mental illness using personality .28 46 77 31 (07 29 861 
tests 


Grades and attitudes in 48 .62 90 56 —.0] 8 50 
psychology course | 

IQ scores on basis of 21 04 91 L.51 143) 15 100 
psychograms 

Business failures using 50 .67 79 53 3| 43 70 
financial ratios 

Student ratings of teaching 5 91 71 56 -0 1 16 
effectiveness 

Performance of life insurance 13 .43 .60 14 06 16 200 
salesmen 

IQ scores using Rorschach tests .47 .54 85 51 05| 10 78 

GPA of graduate students 33 69 65 50 O01 98 90 

GPA of average pf’ students 37 .69 — 43 ч— 40 90. 

Changes in security prices 23 .59 55 28 .06 47 50 

Value of ellipses in 84 97 — 89 wee 6 180 
experimental task 

Means 33. .64 .74 39 03 


Source. Camerer (1981). Reprinted by permission. 
Figure 1. An Example of Argument Structure

[D]ata
    A 12.3 percent reduction in traffic fatalities after the
    crackdown on speeding

THEREFORE (Definitely)

[C]laim
    The crackdown on speeding was worthwhile.

[W]arrant
    Strict enforcement of speeding laws caused traffic fatalities to fall.
    Human life is always worth preserving.

BECAUSE
[B]acking
    The greater the cost of an alternative the less likely it will be pursued.
    Human survival is a self-evident moral principle.

UNLESS
[R]ebuttal
    1. Weather conditions were unusually severe in 1956 (HISTORY)
    2. Mass education produced safer driving habits (MATURATION)
    3. The 1956 decline reflects random fluctuations in the time series (INSTABILITY)
    4. Publicity produced the reduction (TESTING)
    5. Record-keeping changed in 1955 (INSTRUMENTATION)
    6. Many speeding offenders left the state in 1956 (MORTALITY)
    7. 1956 was unrepresentative of the time series (SELECTION)
    8. Deaths in 1955 were extreme and reflect regression towards
       the mean of the time series (REGRESSION)
Lecture Notes: OPIM 102 
Dr. Kleindorfer 
February 2, 1998 
Today: Lecture on Expected Utility Theory 
Next Time: Continuation + Problems Due 
Assignment: February 4 
Today's Basic Points 
• DPL and Investment Theory Revisited 
• Expected Utility Theory 
• Using EU Theory in DPL 


Expected Utility Theory 
Notation 
Space of all Lotteries L(X) with elements A, B, ... 
X = Outcome Space with elements x, y, z etc. 


Typical Lottery 


Alternative Notation for A = [p_1, x_1; p_2, x_2; ...; p_n, x_n] 
Preferences on L(X) 
A > B (or A >_DM B) means DM prefers A to B 
A ≥ B means DM does not prefer B to A or DM weakly prefers A to B 


A~B means DM is indifferent between A and B, i.e. DM will take 
either one and would play a 50-50 lottery to choose between them. 


Axioms and Assumptions of Expected Utility Theory 
Based on von Neumann & Morgenstern (1944); Savage (1954) 
Tastes or Values given by utility function U(x). Beliefs given by 
probability measure p on X. Values and Beliefs are independent 
of each other. Utility depends only on the outcome x. 

vN-M Axioms 
Completeness: DM can compare any two lotteries. 


Transitivity: A > B and B > C implies A > C. (Money pump 
results otherwise) 


Substitution (a.k.a. Independence of Irrelevant Alternatives): 
If x > y, then for any z, [p, x; (1-p), z] > [p, y; (1-p), z] 


Continuity: If x < y < z, there always exists some p: 0 ≤ p ≤ 1, 
such that one can achieve indifference between 
y ~ [p, x; (1-p), z] (Suppose x is "death") 


Compound Lottery: You should feel the same if you have a 1 in 
12 chance of winning (say $100) or playing the following 
two-stage conditional gamble/lottery: 

Stage 1: Flip a coin. If H, then go to stage 2. If T, you lose. 


Stage 2: Toss a die. If 6 comes up you win ($100). If 1 to 5 
comes up you lose. 


Implications of the Axioms 


Theorem: If a DM’s preferences “<” on L(X) satisfy the above 
axioms, then there exists a utility function U(x) for that DM 
which represents her preferences in the sense that for any two 
lotteries A, B in L(X): 


A < B if and only if E{U(A)} < E{U(B)} 


In more detail, suppose: 


A = [p_1, x_1; p_2, x_2; ...; p_n, x_n], where p_1 + ... + p_n = 1 
B = [q_1, y_1; q_2, y_2; ...; q_m, y_m], where q_1 + ... + q_m = 1 


If DM's preferences satisfy the Axioms, then the DM prefers A 
to B (A > B) if and only if 


E{U(A)} = p_1 U(x_1) + p_2 U(x_2) + ... + p_n U(x_n) 
> E{U(B)} = q_1 U(y_1) + q_2 U(y_2) + ... + q_m U(y_m) 


Note: Two utility functions U(x) and V(x) represent the same 
preferences if and only if: 


U(x) = aV(x) + b, a > 0; b any number 

Example: U(x) = .4V(x) - .2 (here a = .4 and b = -.2) 


Why ??: Check that if U(x) = aV(x) + b (with a > 0), then for 
any two lotteries A and B, 


E{U(A)} > E{U(B)} if and only if E{V(A)} > E{V(B)} 


Thus, we can fix any two values of a utility function without 
disturbing preference orders. 
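
The "Why??" check can also be run numerically. The following is a minimal sketch, assuming a hypothetical V(x) = sqrt(x) and two made-up lotteries A and B (none of these come from the notes); only the relation U = .4V - .2 is taken from the note above.

```python
import math


def expected_utility(lottery, u):
    """lottery is a list of (probability, outcome) pairs, as in [p1, x1; p2, x2; ...]."""
    return sum(p * u(x) for p, x in lottery)


# Hypothetical utility function (an assumption for illustration only).
def V(x):
    return math.sqrt(x)


def U(x):
    return 0.4 * V(x) - 0.2  # U = aV + b with a = .4 > 0 and b = -.2


# Two hypothetical lotteries.
A = [(0.5, 0.0), (0.5, 100.0)]
B = [(1.0, 36.0)]

eu_v = (expected_utility(A, V), expected_utility(B, V))
eu_u = (expected_utility(A, U), expected_utility(B, U))
same_ranking = (eu_v[0] > eu_v[1]) == (eu_u[0] > eu_u[1])
print(eu_v, eu_u, same_ranking)
```

The reason the ranking cannot change is that expectation is linear: E{U(A)} = a E{V(A)} + b for every lottery, and multiplying by a > 0 and adding b preserves order.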


Constructing a Utility Function for Money 


Arbitrarily assign utilities to two numbers. For example, 
U(-100) =-1 and U(100) = 1. 


Use continuity axiom to specify other utilities. 
Two methods: 


Probability Equivalence: Find the p which makes you 
indifferent between playing the following gamble A and taking 
x dollars for certain. For each such x, you will have 


U(x) = pU(100) + (1-p)U(-100) = p(1) + (1-p)(-1) = 2p-1. 


Now vary x and plot the result. You have your utility function! 


Gamble A = [p, 100; (1-p), -100] 


Certainty Equivalence: Same procedure except now fix p and 
find the x which makes you indifferent. Plot as x varies, where x 
is called the certainty equivalent of A, and written x = CE(A). 



Properties of Utility Functions 


Let W be initial wealth. Consider a given Gamble A, with 
expected value E{A}. Compare this with the certainty 
equivalent CE(A) of the Gamble A. We know that: 


E{U(W+A)} = U(W+CE(A)) 


This just says the DM is indifferent between playing the gamble 
A and receiving CE(A) for certain. What is your CE(A) if A is 
the Gamble A = [-10, .5; 10, .5], i.e. win or lose $10 depending 
on the outcome of a fair coin toss? Neglect W or assume it is 
included in A. Then we have: 


Risk Aversion: U(CE(A)) = E{U(A)} < U(E{A}) 


This says that the DM would rather have E{A} for certain than 
playing the Gamble A. This happens if U(x) is concave. 
Contrast with Risk Prone (convex) or Risk Neutral (linear). 


Risk Premium RP(A) is defined as the difference: 


RP(A) = E{A} - CE(A) 


Example: Suppose initial wealth W = 100 and a DM faces a 
possible loss of 75 with probability .2. Suppose insurance 
premiums are z = .25/$ of coverage. If the DM's utility function 
is U(x) = x^(1/2), answer the following: 


What is E{A} and CE(A) if A is the Gamble [.2, -75; .8, 0]? 
What is the risk premium RP(A)? How much insurance, if any, 
should the DM buy in this case? 
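
A rough numerical pass at this example is sketched below. It assumes that "z = .25/$ of coverage" means a premium of .25 per dollar of coverage q, that coverage is limited to 0 <= q <= 75, and that a grid search in steps of .01 is accurate enough; the names `eu_with_coverage` and `best_q` are hypothetical.

```python
import math

W, LOSS, P_LOSS, RATE = 100.0, 75.0, 0.2, 0.25  # values from the example


def u(x):
    return math.sqrt(x)


# Gamble A = [.2, -75; .8, 0]
expected_value = P_LOSS * (-LOSS)                      # E{A}
eu_no_insurance = P_LOSS * u(W - LOSS) + (1 - P_LOSS) * u(W)
certainty_equivalent = eu_no_insurance ** 2 - W        # solves U(W + CE) = E{U(W + A)}
risk_premium = expected_value - certainty_equivalent   # RP(A) = E{A} - CE(A)


def eu_with_coverage(q):
    """Expected utility when q dollars of coverage are bought at RATE per dollar."""
    premium = RATE * q
    return P_LOSS * u(W - LOSS + q - premium) + (1 - P_LOSS) * u(W - premium)


# Grid search over coverage levels 0.00, 0.01, ..., 75.00 (assumes no over-insurance).
grid = [i / 100 for i in range(0, 7501)]
best_q = max(grid, key=eu_with_coverage)

print(expected_value, certainty_equivalent, risk_premium, best_q)
```

Since the premium rate (.25) exceeds the loss probability (.2), the insurance is priced above its actuarial value, which is why the search settles on partial rather than full coverage.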


Exponential Utility Functions 
Very commonly used: e.g., in DPL 


U(x) = a - b e^(-cx), where x is any real number (money) and c > 0 
is called the (degree of) risk aversion. Simplest case a = 0, b = 
1, so that 


U(x) = -e^(-cx) 

For this U and any initial wealth level W, let us compute the 
CE(A) of the following lottery A = [p, x; (1-p), -y], where x > 0 
and y > 0. 


E{U(W+A)} = U(W+CE(A)) 
or 
-p exp[-c(W+x)] - (1-p) exp[-c(W-y)] = -exp[-c(W+CE(A))] 


This yields the equation: 
p exp[-cx] + (1-p) exp[cy] = exp[-cCE(A)] 
Note that this is independent of W. As c gets larger, the 
concavity of U increases and the DM becomes more risk averse. 
Operationally, what this amounts to is that as c gets larger, 
RP(A) = E{A} - CE(A) 
gets larger, i.e. CE(A) gets smaller. Why? Check this out in 
DPL using the insurance example above, but with 
U(x) = -exp[-cx] instead of U(x) = x^(1/2). 
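
Outside DPL, the same two claims (CE(A) does not depend on W, and RP(A) grows with c) can be checked with a few lines of Python. This is a sketch under assumed numbers: the fair-coin gamble of plus or minus 10 and the values of c are illustrative, and the function names are hypothetical.

```python
import math


def ce_exponential(p, x, y, c):
    """CE of A = [p, x; (1-p), -y] under U(x) = -exp(-c x), from p exp[-cx] + (1-p) exp[cy] = exp[-c CE]."""
    return -math.log(p * math.exp(-c * x) + (1 - p) * math.exp(c * y)) / c


def ce_by_definition(p, x, y, c, w):
    """Same CE computed from E{U(W+A)} = U(W+CE(A)) at an explicit wealth level w."""
    eu = -p * math.exp(-c * (w + x)) - (1 - p) * math.exp(-c * (w - y))
    return -math.log(-eu) / c - w


p, x, y = 0.5, 10.0, 10.0  # hypothetical gamble: win or lose 10 on a fair coin
expected_value = p * x - (1 - p) * y
for c in (0.01, 0.05, 0.2):
    ce = ce_exponential(p, x, y, c)
    print(c, round(ce, 4), round(expected_value - ce, 4))  # c, CE(A), RP(A)

# Wealth independence: the CE is the same at W = 0 and W = 50.
print(ce_by_definition(p, x, y, 0.05, 0.0), ce_by_definition(p, x, y, 0.05, 50.0))
```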


Lecture Notes: OPIM 102 
Dr. Kleindorfer 

January 28, 1998 
Today: Lecture on DPL and Investment Theory 
Next Time: Lecture on Expected Utility Theory 
Assignment: February 2— Problem Due (see below) 
Today's Basic Points 

• DPL and Steve Revisited 
• Expected Value and Other Decision Rules 
• Investment Theory 





Consider the single-period option problem with the parameters 
given in Chapter 2 of Dixit and Pindyck. Thus, assume Initial 
Price = 200, q = .5, DEL = .5 and Investment Cost = 1600. 
Determine the Option Value (the maximum price you should be 
willing to pay to be able to keep your investment alternatives open 
until after the Market Development is known) as a function of the 
value of q, DEL and the cost of capital. [Recall that the discount 
rate for computing NPV is given by DR = 1/(1+RHO) if RHO is 
the cost of capital]. Discuss your results briefly and provide an 
intuitive rationale for them. 
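
As a starting point for the assignment, the option value can be sketched as the NPV of waiting minus the NPV of investing today. The code below is an illustration only and rests on assumptions not stated in the assignment text: the project pays the price once per period forever, the price moves once (up by DEL with probability q, down by DEL otherwise) and then stays put, and the base cost of capital is RHO = .10 as in the Dixit and Pindyck example. The function name `option_value` is hypothetical.

```python
def option_value(p0=200.0, q=0.5, delta=0.5, cost=1600.0, rho=0.10):
    """Value of keeping the investment decision open for one period (assumptions in the lead-in)."""
    dr = 1.0 / (1.0 + rho)  # DR = 1/(1+RHO)
    p_up, p_down = p0 * (1 + delta), p0 * (1 - delta)
    expected_p1 = q * p_up + (1 - q) * p_down

    # Invest today: pay the cost, receive p0 now and the expected price in perpetuity afterwards.
    npv_now = max(0.0, -cost + p0 + expected_p1 / rho)

    # Wait: next period, invest only in the states where it pays to do so.
    def npv_if_invest_next_year(price):
        return max(0.0, -cost + price * (1 + rho) / rho)

    npv_wait = dr * (q * npv_if_invest_next_year(p_up)
                     + (1 - q) * npv_if_invest_next_year(p_down))
    return npv_wait - npv_now


print(round(option_value(), 2))
for q in (0.3, 0.5, 0.7):
    print("q =", q, round(option_value(q=q), 2))
for rho in (0.05, 0.10, 0.20):
    print("RHO =", rho, round(option_value(rho=rho), 2))
```

Varying `q`, `delta` and `rho` one at a time, as in the loops above, gives the sensitivity the assignment asks about; the intuition to look for is that waiting lets the investor avoid committing in the bad state.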














BANANA'S COST OF OFFER TO STEVE 

[Decision tree: offers ORIGINAL and PUT OPTION; events Steve Accepts / Steve Rejects and EYF Succeeds / EYF Fails; payoffs expressed in terms of Current Price, HIGH SHARE PRICE and LOW SHARE PRICE]


OPIM 102: Decision Processes 
Dr. Kleindorfer 
January 26, 1998 


Today 
e Hangman’s Problem 
e DPL and Decision Theory 


e Southern Electronics Case Study 


Next Time: 1/28/98 
Investment Theory and DPL 


Note: A couple of Homeworks Coming Up 


Assignment Notes: OPIM 102 
Dr. Kleindorfer 
January 21, 1998 


Today: Lecture on Decision Analysis 


Next Time: Southern Electronics I & II (2-page write-up due) 


Today's Basic Points 
Influence Diagrams 
Basic Concepts Related to Decision Trees and Choice 
Expected Value and Probability Theory 
EOL = Expected Opportunity Loss = Expected Regret 
Minimizing EOL is identical to Bayes = Max E{Value} 


Expected Value of Information 
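
The link between EOL, the Bayes (maximum expected value) rule and the value of information can be seen in a small table. The sketch below uses an entirely hypothetical payoff table and state probabilities; it computes the expected value of perfect information, which is one special case of the expected value of information.

```python
# Hypothetical payoff table (not from the notes): rows are alternatives, columns are states.
probs = [0.3, 0.5, 0.2]
payoffs = {
    "launch":  [-40.0, 60.0, 150.0],
    "license": [10.0, 30.0, 50.0],
    "drop":    [0.0, 0.0, 0.0],
}


def expected(values):
    return sum(p * v for p, v in zip(probs, values))


# Best payoff attainable in each state if the state were known in advance.
best_by_state = [max(row[j] for row in payoffs.values()) for j in range(len(probs))]

ev = {a: expected(row) for a, row in payoffs.items()}
# Opportunity loss (regret) in a state = best payoff in that state minus the payoff received.
eol = {a: expected([b - v for b, v in zip(best_by_state, row)]) for a, row in payoffs.items()}

best_ev = max(ev, key=ev.get)
best_eol = min(eol, key=eol.get)
evpi = expected(best_by_state) - ev[best_ev]  # expected value of perfect information

print(ev, eol, best_ev, best_eol, evpi)
```

For every alternative, EV plus EOL equals the same constant (the expected payoff under perfect information), so the alternative with the largest EV is the one with the smallest EOL, and that smallest EOL equals the expected value of perfect information.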


Product Sales1 
Same as before, except now we define a variable (License-Fee) 
in the associated influence diagram, which we can use for sensitivity analysis. 
Lecture Notes: OPIM 102 
Dr. Kleindorfer 
February 4, 1998 


Today: Continuing Discussion of Expected Utility Theory 


Next Time: Problems Due on EU Theory 
Please do Spreadsheet Wars Case—BUT NO 
WRITE-UP REQUIRED!!! 


Today’s Basic Points 


Expected Utility Theory 
Risk Taking and EU Theory 
Using EU Theory in DPL 
Examples 


Construction of Utility Functions 
Note: Homework: Can really only fix 2 points 
Results from Class: PE and CE Methods 
Concave in Positive Domain 
Convex in Negative Domain 


Certainty Equivalents and Risk Premium 


Risk Preferences 
Risk Averse 


Concave Utility Function (over relevant domain) 
Positive Risk Premium [RP(A) = E{A} - CE(A) > 0] 
DM likes E{A} better than A 
Examples: 
U(x) = a - b e^(-cx), where b > 0 and c > 0, x arbitrary. 
U(x) = log[a + bx], where b > 0 and a + bx > 0 
U(x) = x^a, where 0 < a < 1 and x > 0. 

Risk Prone 
Convex Utility Function (over relevant domain) 
Negative Risk Premium 
DM Likes Gamble A better than E{A} for certain 
Example: 
U(x) = x^β, where β > 1, x > 0. 

Risk Neutral 


Linear Utility Function (U(x) = ax + b, a > 0) 
DM indifferent between Gamble A and E{A} = CE(A) 


Example: An investor has the opportunity to invest $1000 in so- 
called “Act of God Bonds”, which will provide the following 
payments (to help provide reinsurance against Natural Disasters 
of magnitude greater than $1 Billion for a specified set of 
Insurance Companies in a Given Region for a Given Hazard, 
e.g., Hurricanes in Florida): 


30% return if no Disaster occurs in the next 12 months 
10% return if one Disaster occurs in the next 12 months 
-30% return if two Disasters occur in the next 12 months 
Lose everything if three or more Disasters occur in the next 
12 months 


Based on the last thirty years of history of Natural Disasters, the 
investor believes that the probabilities of the above events are: 


No Disaster .80 
One Disaster .16 
Two Disasters .02 
> Two Disasters .02 


What is the Investor's Certainty Equivalent of a single Act of 
God Bond? Of 2 such bonds (i.e., investing $2000)? IF: 


• The investor is Risk Neutral 
• The investor is Risk Averse with Exponential Utility 
  Function with Risk Aversion = .01 (i.e., with Risk 
Tolerance = 100) 
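
A quick way to check answers to this example is sketched below. It assumes that outcomes are the gross dollar payoffs of the bonds (1.3, 1.1, .7 or 0 times the amount invested), that the probability of no disaster is .80 so that the four probabilities sum to one, and that risk aversion .01 is measured per dollar. The function names are hypothetical.

```python
import math

# (probability, gross payoff per $1000 bond) for 0, 1, 2 and more than 2 disasters
AOG = [(0.80, 1300.0), (0.16, 1100.0), (0.02, 700.0), (0.02, 0.0)]


def ce_risk_neutral(n_bonds):
    """For a risk-neutral investor the certainty equivalent is the expected payoff."""
    return sum(p * n_bonds * x for p, x in AOG)


def ce_exponential(n_bonds, c=0.01):
    """CE under U(x) = -exp(-c x): CE = -(1/c) ln E{exp(-c x)}."""
    return -math.log(sum(p * math.exp(-c * n_bonds * x) for p, x in AOG)) / c


for n in (1, 2):
    print(n, round(ce_risk_neutral(n), 2), round(ce_exponential(n), 2))
```

With a risk tolerance of only 100 dollars, the small chance of losing everything dominates the exponential calculation, so the risk-averse certainty equivalent barely moves when a second bond is added, while the risk-neutral one doubles.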


AoG Bond Decision (BASE Case)

    Invest the $1000*N at Interest R:           (1+R)*N*1000
    Buy N AoG Bonds:
        No Disasters Occur               .8     (1.3*N*1000)
        One Disaster Occurs              .16    (1.1*N*1000)
        Two Disasters Occur              .02    (.7*N*1000)
        More than Two Disasters Occur    .02    0


Portfolio Example: Now suppose the investor is interested in 
investing $5,000 in a “Portfolio of Bonds” with two possible 
Bonds: The Act of God Bonds described above and a risk free 
government bond yielding a return of 8%. How much should 
the investor invest in each of these types of Bonds under the 
above two risk preference conditions? 


As an example, consider the NPV of investing $2,000 in AoG 
Bonds and $3,000 in risk free government bonds if 1 disaster 
occurs (so the return from each AoG Bond is 1.1): 


NPV = (3000* 1.08) + (2000*1.10) = $5,440 


In general, for any N (the number of AoG Bonds purchased), we 
have 


NPV = (5-N)*(1000*1.08) + (N)*(1000)*(1 + RET) 


where RET is the return associated with the particular state of 
the world (number of disasters) which occurs. 
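
The NPV formula can be evaluated for every whole number of AoG bonds and ranked under the two risk preferences. This sketch assumes bonds come in $1000 units (N = 0, ..., 5), uses the state probabilities .80, .16, .02, .02 from the earlier example, and treats the risk aversion of .01 as per dollar; the function names are hypothetical.

```python
import math

# (probability, RET) for 0, 1, 2 and more than 2 disasters
STATES = [(0.80, 0.30), (0.16, 0.10), (0.02, -0.30), (0.02, -1.00)]


def npv(n, ret):
    """NPV = (5-N)*(1000*1.08) + N*1000*(1+RET), as in the formula above."""
    return (5 - n) * (1000 * 1.08) + n * 1000 * (1 + ret)


def ce_risk_neutral(n):
    return sum(p * npv(n, r) for p, r in STATES)


def ce_exponential(n, c=0.01):
    """Certainty equivalent under U(x) = -exp(-c x)."""
    return -math.log(sum(p * math.exp(-c * npv(n, r)) for p, r in STATES)) / c


best_neutral = max(range(6), key=ce_risk_neutral)
best_averse = max(range(6), key=ce_exponential)
for n in range(6):
    print(n, round(ce_risk_neutral(n), 2), round(ce_exponential(n), 2))
print(best_neutral, best_averse)
```

Ranking by certainty equivalent is the same as ranking by expected utility here, because the certainty equivalent is an increasing function of expected utility.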




[Decision tree: Portfolio Choice, followed for each choice by the chance node Number of Disasters with branches D=0, D=1, D=2, and D=3 or More]











Example: A DM is given the choice 
between various insurance policies for a particular risk (say an 
auto accident) which he believes has a chance of .01 of 
happening over the duration of the insurance policy. In the 
event of an accident, the DM expects damages of $5000 and 
otherwise (with probability .99) the DM expects no damages. If 
the DM buys full coverage insurance, then the insurance 
company would pay all damages. 


Suppose the insurance premium at which the DM is exactly 
indifferent between insuring (with full coverage) and not 
insuring at all is $100. Suppose also that the DM has 
preferences which can be (at least approximately) represented 
by an exponential utility function U(x) = -exp[-cx]. 

What is DM's risk aversion c (or risk tolerance 1/c)? 


Should the DM buy a policy with premium $70 but which has 
a deductible of $1000? 


Note: From the stated indifference example, we know: 
p exp[-cx] + (1-p) exp[cy] = exp[-cCE(A)] 

.01 exp[-c(-5000)] + (.99) exp[c(0)] = 
exp[-cCE(Full Coverage at $100)] 

or .01 exp[5000c] + .99 = exp[100c] 


which implies с = .0002555 or RT = 1/с = 3913 
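
The value of c can be recovered numerically and then used to answer the deductible question. The sketch below uses plain bisection on the indifference equation; the bracket [1e-6, 1e-3] is an assumption chosen to exclude the trivial root c = 0, and the function names are hypothetical.

```python
import math

P_ACC, DAMAGE = 0.01, 5000.0


def indifference_gap(c):
    """.01 exp[5000c] + .99 - exp[100c]; this is zero at the DM's risk aversion c."""
    return P_ACC * math.exp(DAMAGE * c) + (1 - P_ACC) - math.exp(100.0 * c)


# Bisection: the gap is negative just above c = 0 and positive at c = .001.
lo, hi = 1e-6, 1e-3
for _ in range(100):
    mid = (lo + hi) / 2
    if indifference_gap(mid) < 0:
        lo = mid
    else:
        hi = mid
c = (lo + hi) / 2


def certainty_equivalent(lottery):
    """CE under U(x) = -exp(-c x) for a list of (probability, outcome) pairs."""
    return -math.log(sum(p * math.exp(-c * x) for p, x in lottery)) / c


ce_no_insurance = certainty_equivalent([(P_ACC, -DAMAGE), (1 - P_ACC, 0.0)])
ce_deductible = certainty_equivalent([(P_ACC, -70.0 - 1000.0), (1 - P_ACC, -70.0)])
print(round(c, 7), round(1 / c), round(ce_no_insurance, 2), round(ce_deductible, 2))
```

The deductible policy is worth buying if its certainty equivalent is higher (less negative) than that of going uninsured, which by construction equals -100.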


Insurance Decision

    Buy Insurance with $1000 Deductible (-70):
        Accident       .01    -1000
        No Accident    .99    0
    Buy No Insurance:
        Accident       .01    -5000
        No Accident    .99    0
Today's High Points 
Certainty Equivalent and Risk Premium 
Risk Preferences 
Exponential Utility Functions 


Examples Including Spreadsheet Wars Case 





Sam's Decision Problem 

[Decision tree (Spreadsheet Wars case): Settle Now or Go to Court; Focus Sues Forward; Forward Wins or Forward Loses; Low, Medium and High outcomes]
Lecture Notes: OPIM 102 
Dr. Kleindorfer 
February 11, 1998 


Today's High Points 


• Revisit WTA & WTP 
• Various Puzzles for Risky Choice 
• Prospect Theory and Alternative Theories of Choice 


For Next Time: Please complete and pass in 
Intertemporal Choice Experiment 



Some Examples 


1. Certainty Equivalent and Risk Premium: Suppose a DM has 
the following utility function 


U(x) = Min[x, .5x+40] 
Show that U(x) is concave with a kink at x = 80 (plot it!). Thus, 
we know that the DM with this utility function is risk averse. 
Illustrate this by computing for W = 0 and the lottery A: 
A = [.6, 50; .4, 100] 
a. E{A} = .6(50) + .4(100) = 70 
b. CE(A) (= 66) and RP(A) (= 4 > 0) 


U(CE(A)) = .6U(50) + .4U(100) = .6(50) + .4(90) = 66 
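
The same computation in code, as a small sketch: `u_inverse` is a hypothetical helper that inverts the kinked utility function (identity below the kink at 80, the flatter branch above it).

```python
def u(x):
    return min(x, 0.5 * x + 40.0)


def u_inverse(value):
    """Inverse of u: below the kink u(x) = x, above it u(x) = .5x + 40."""
    return value if value <= 80.0 else 2.0 * (value - 40.0)


A = [(0.6, 50.0), (0.4, 100.0)]  # A = [.6, 50; .4, 100]
expected_value = sum(p * x for p, x in A)
expected_utility = sum(p * u(x) for p, x in A)
certainty_equivalent = u_inverse(expected_utility)
risk_premium = expected_value - certainty_equivalent
print(expected_value, certainty_equivalent, risk_premium)
```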






Concave = Risk Averse 


2. Maximum Willingness to Pay (WTP) for a DM: The 
Maximum a DM is WTP to play a lottery is found by 
considering the impact that this payment will have on the 
ultimate outcome of the lottery. Thus, if the DM pays 10 to 
play the lottery 


A = [.6, 50; .4, 100], 


then this is in effect the same thing as playing the lottery A-10, 
obtained from A by reducing every outcome by 10: 


A-10 = [.6, 40; .4, 90] 


To determine the maximum which the DM should be WTP to 
play a lottery like A, we simply find the largest payment the 
DM could make and still not be worse off than doing nothing. 

If W is the DM’s initial wealth level, this amounts to solving the 
following equation: 


U(W) = E{U(W + A - WTP)} 


Consider the lottery A above and assume the DM has 
preferences which can be represented by an exponential utility 
function U(x) = -exp[-cx], where c = the DM's degree of risk 
aversion. Write an expression characterizing MAX WTP(A). 
Write another expression characterizing CE(A) and show (if you 
can) that CE(A) = MAX WTP(A). Can you explain why this is 
so? Do the same thing for U(x) = x^(1/2) and show when W=100 
that CE(A) > WTP(A). 


-exp[-cW] = -.6exp[-c(W+50-WTP)] - .4exp[-c(W+100-WTP)] 
-exp[-c(W+CE(A))] = -.6exp[-c(W+50)] - .4exp[-c(W+100)] 
U(x) = SQRT(x), W = 100 CE = 69.14, WTP = 68.57 
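
The two defining equations can be solved numerically to reproduce these figures. The sketch below uses plain bisection; the search bracket [50, 100] and the risk aversion c = .02 used for the exponential comparison are assumptions made for illustration, and the helper names are hypothetical.

```python
import math

A = [(0.6, 50.0), (0.4, 100.0)]  # A = [.6, 50; .4, 100]
W = 100.0


def bisect(f, lo, hi, iterations=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo_positive = f(lo) > 0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def expected_utility(u, shift):
    return sum(p * u(W + x + shift) for p, x in A)


def certainty_equivalent(u):
    # Solves U(W + CE) = E{U(W + A)}
    target = expected_utility(u, 0.0)
    return bisect(lambda ce: u(W + ce) - target, 50.0, 100.0)


def max_wtp(u):
    # Solves U(W) = E{U(W + A - WTP)}
    return bisect(lambda wtp: expected_utility(u, -wtp) - u(W), 50.0, 100.0)


def exp_u(x, c=0.02):  # hypothetical degree of risk aversion
    return -math.exp(-c * x)


ce_sqrt, wtp_sqrt = certainty_equivalent(math.sqrt), max_wtp(math.sqrt)
ce_exp, wtp_exp = certainty_equivalent(exp_u), max_wtp(exp_u)
print(round(ce_sqrt, 2), round(wtp_sqrt, 2))
print(round(ce_exp, 4), round(wtp_exp, 4))
```

For the exponential case the two numbers coincide because wealth drops out of both equations; for the square-root case paying WTP lowers wealth, which makes the DM more risk averse and pushes WTP below CE.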




3. Minimum Willingness to Accept (WTA) to play a lottery. 
In a similar fashion, if the lottery is not a desirable one, and you 
are given a free choice whether or not to play it, then you may 
have to be paid something to play it. Consider the lottery 


B = [.5, -50; .5, 0] 


Then if you are paid something to play B, say you are paid 10, 
this has the effect of adding 10 to each outcome, resulting in the 
lottery B+10: 


B+10 = [.5, -40; .5, 10] 


The minimum (WTA) you should accept to be paid to play B 
would be the amount which would just make you indifferent 
between the lottery + payment and doing nothing. If W is your 
initial wealth, this results in the equation: 


U(W) = E{U(W + B + WTA)} 


Suppose you are risk neutral (i.e., you have the utility function 
U(x) = x), what is the minimum WTA you would accept to play 
the lottery B given above? In this case, show that this is the 
same as -CE(B). Would this change if U(x) = -exp[-cx]? If 
W=100 and U(x) = x^(1/2)? Does WTA(B) = -CE(B) always? 


ANSWER: This holds for exponential, but not for SQRT. 
For example: 


-exp[-c100] = 
-.5exp[-c(100-50+WTA)] - .5exp[-c(100-0+WTA)] 
or 


2 = exp[-c(WTA - 50)] + exp[-cWTA] 


or equivalently 


2 = exp[-c(-CE(B) - 50)] + exp[cCE(B)] 
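
The claim in the ANSWER can be checked numerically for lottery B. This is a sketch only: the risk aversion c = .02 for the exponential case and the bisection brackets are assumptions, and the helper names are hypothetical.

```python
import math

B = [(0.5, -50.0), (0.5, 0.0)]  # B = [.5, -50; .5, 0]
W = 100.0


def bisect(f, lo, hi, iterations=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo_positive = f(lo) > 0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def certainty_equivalent(u):
    # Solves U(W + CE) = E{U(W + B)}
    target = sum(p * u(W + x) for p, x in B)
    return bisect(lambda ce: u(W + ce) - target, -50.0, 0.0)


def min_wta(u):
    # Solves U(W) = E{U(W + B + WTA)}
    return bisect(lambda wta: sum(p * u(W + x + wta) for p, x in B) - u(W), 0.0, 50.0)


def exp_u(x, c=0.02):  # hypothetical degree of risk aversion
    return -math.exp(-c * x)


for name, u in (("neutral", lambda x: x), ("exponential", exp_u), ("sqrt", math.sqrt)):
    print(name, round(min_wta(u), 4), round(-certainty_equivalent(u), 4))
```

Under these assumptions the risk-neutral and exponential rows show WTA(B) = -CE(B), while the square-root row shows a WTA below -CE(B), in line with the ANSWER above.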




4. Portfolio Choice: Suppose a DM with initial wealth W and 
an investment budget B can invest in n different securities, i = 
1, ..., n. Suppose the return from investing x_i in security i is x_i R_i, where 
R_i is a random variable with a known distribution. Then the 
portfolio choice problem can be represented as follows: 


Maximize E{U(W + Σ x_i R_i)} 


Subject to: Σ x_i ≤ B, x_i ≥ 0, for all i. 


As an example, consider the Act of God Bonds we discussed 
last time. In this case, we had two securities, one of which had 
a certain return and the other had a random return. 


What do you think happens to the investment/portfolio choices 
of the DM if the DM is very risk averse? I.e., what happens in 
the case U(x) = -exp|[-cx] if c is large? 


ANSWER: DM starts moving to a less risky portfolio. If 
we assume normal returns, then we can actually plot 
U(mean, sigma) EU contours and the efficient frontier for 
different portfolios. 


Paradoxes for Choice Under Uncertainty 
• Allais Paradox 
    The Certainty Effect 
• Ellsberg Paradox 
    Subjective Probability & Ambiguity Aversion 


Ellsberg: Urn has 30 red balls and 60 black or yellow balls 


Situation I 
Choice A Win $100 if a red ball is pulled 
Win $0 if a black or yellow ball is pulled 
Choice B Win $100 if a black ball is pulled 
Win $0 if a red or yellow ball is pulled 
Situation II 
Choice C Win $100 if a red or yellow ball is pulled 
Win $0 if a black ball is pulled 
Choice D Win $100 if a black or yellow ball is pulled 


Win $0 if a red ball is pulled 


Most people prefer A to B and D to C. But this problem has the 
same structure as Allais (with a common outcome). Easily Seen 
that if A > B, then C > D. What's up? 


p1 = Probability of a Red Ball 
p2 = Probability of a Black Ball 
p3 = Probability of a Yellow Ball 
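
The "Easily Seen" step can be verified by brute force. The sketch below assumes the DM holds some definite subjective probability p2 for black (with p1 = 1/3 and p2 + p3 = 2/3) and compares expected winnings; since each choice pays either $100 or $0, comparing expected winnings is the same as comparing expected utilities whenever U(100) > U(0). The function name is hypothetical.

```python
from fractions import Fraction

P_RED = Fraction(1, 3)  # 30 red balls out of 90


def expected_wins(p_black):
    """Expected payoff of choices A-D for a given subjective probability of black."""
    p_yellow = Fraction(2, 3) - p_black
    return {
        "A": 100 * P_RED,
        "B": 100 * p_black,
        "C": 100 * (P_RED + p_yellow),
        "D": 100 * (p_black + p_yellow),
    }


consistent = True
for k in range(0, 61):  # the urn could hold anywhere from 0 to 60 black balls
    ev = expected_wins(Fraction(k, 90))
    a_over_b = ev["A"] > ev["B"]
    c_over_d = ev["C"] > ev["D"]
    consistent = consistent and (a_over_b == c_over_d)
print(consistent)
```

Both comparisons reduce to p1 > p2, because p3 is added to each side in Situation II. No single belief about the number of black balls can therefore support preferring A to B together with D to C.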


Allais Paradox: First Tree Shows the Y Lottery = $1 Million with Certainty 

[Spreadsheet decision trees for the Allais choices]

Framing and Response Mode Effects 
e Kahneman and Tversky Examples 
e Other Examples of Gain-Loss Framing 


Rare Plague in Small Village (600 people) 





Program A: 200 lives will be saved 
Program B: 1/3 chance that 600 will be saved, 2/3 
chance that 0 will be saved. 


Program C: 400 lives will be lost 

Program D: 1/3 chance that 0 will be lost, 2/3 chance that 
600 will be lost. 

Most say A > B and D > C. 


Demand Side Management Programs 





Program A: 200 MW will be saved (for certain) 
Program B: 1/3 chance 600 MW will be saved, 2/3 

chance 0 MW will be saved 
Programs C and D as above. 


Un-Employment: Program A: Save 200 jobs for certain, 
Program B: 1/3 600 saved, 2/3 0 saved. Etc. 




Prblim № EUT 
Preference Reversals 


Gamble | Probability of Winning | Amount to Win | Probability of Losing | Amount to Lose
       | PW                     | $W            | PL                    | $L
L-Bet  | .33                    | $16           | .67                   | $2
H-Bet  | .99                    | $4            | .01                   | $1


Most subjects say H is more attractive to them.
Most subjects say WTA more for L than H. 
What’s a possible explanation? 
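As a quick check on why this pattern is puzzling, the sketch below computes the expected value of each bet. It assumes the table reads L-Bet = (.33 chance to win $16, .67 chance to lose $2) and H-Bet = (.99 chance to win $4, .01 chance to lose $1); the function name is illustrative.

```python
def expected_value(p_win, win, p_lose, lose):
    """Expected monetary value of a two-outcome bet."""
    return p_win * win - p_lose * lose


l_bet = expected_value(0.33, 16, 0.67, 2)
h_bet = expected_value(0.99, 4, 0.01, 1)
print(round(l_bet, 2), round(h_bet, 2))
```

Under this reading the two bets have almost the same expected value, so the reversal between choice and stated WTA cannot be explained by a difference in expected payoff.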


Under EUT, H > L implies:

EU(W0 + H) = U(W0 + WTA(H)) > EU(W0 + L) = U(W0 + WTA(L)), so
WTA(H) > WTA(L)
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Alternative Descriptive Models (Regret) 


Tversky & Kahneman, 1979





Positive Lotteries: L = [x; p; y] (for y > x > 0)
V(L) = π(p)[v(y) - v(x)] + v(x)


Negative Lotteries: L = [-x; p; -y] (for y > x > 0)
V(L) = π(p)[v(-y) - v(-x)] + v(-x)


Mixed Lotteries: L = [-x; p; y] (for any x, y > 0)
V(L) = π(p)v(-x) + π(1-p)v(y)


Key Points: Shape of V
Shape of π(p)
Reference Point & Framing 
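The three valuation rules can be sketched in code. The slides do not give functional forms for v or π, so the power value function and the inverse-S weighting function below (with parameters 0.88, 2.25 and 0.61, taken from Tversky and Kahneman's 1992 estimates) are assumptions chosen only for illustration; all function names are hypothetical.

```python
def v(x, alpha=0.88, lam=2.25):
    """Assumed value function: concave for gains, convex and steeper for losses."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha


def pi(p, gamma=0.61):
    """Assumed probability weighting function (overweights small p)."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


def value_positive(x, p, y):
    """V(L) for gains, y > x > 0, where pi(p) multiplies v(y) - v(x)."""
    return pi(p) * (v(y) - v(x)) + v(x)


def value_negative(x, p, y):
    """V(L) for losses -x and -y, with y > x > 0."""
    return pi(p) * (v(-y) - v(-x)) + v(-x)


def value_mixed(x, p, y):
    """V(L) for a mixed lottery: lose x with probability p, win y otherwise."""
    return pi(p) * v(-x) + pi(1 - p) * v(y)


# An even-odds gamble to win or lose 100 has negative value (loss aversion).
print(value_mixed(100, 0.5, 100) < 0)
```

With these assumed forms, small probabilities are overweighted (pi(0.01) > 0.01) and a symmetric even-odds gamble is rejected, which matches the key points listed on the slide.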


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Lecture Notes: OPIM 102 
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Today’s High Points 


• Prospect Theory and Alternative Theories of Choice
(Continued)


• Introduction to Intertemporal Choice





U(x1, t1, ..., xn, tn) = Σ_{i=1..n} φ(t_i) v(x_i)


Reference Point Effect and Value Function 





Discounting Function φ(t)





Applications to Energy Efficiency 


Time-Preference Experiment 


OPIM 102 


February 18, 1998 


“Choice Over Time” 


Choice Over Time

* Normative Approach: Discounted Utility Theory (DUT)

* Descriptive Approach:
A Behavioral Model of Intertemporal Choice

* Intertemporal Choice under Uncertainty
- Example: Risk Mitigation Measures


Time Preference and Discounting

* Why do we discount the future?
"$100 today is worth more to me than $100 next year"
"A bar of chocolate today is worth more to me than a bar of
chocolate next year"
"A house which I can start living in now is worth more to me
than a house I can move into next year"





Assume you will get $100 one year from today.
How much is it worth to you today?

Timeline: X0, X1, X2

Present Value: PV(X_t) = X_t / (1 + r)^t
Discount Factor: δ^t = 1 / (1 + r)^t


* Samuelson (1937), Koopmans (1960)

* Consumption streams: c = (c_1, ..., c_T) and c* = (c_1*, ..., c_T*)

* Discount factor δ = 1/(1 + r), where r: discount rate (e.g. 10%)

* c is preferred to c* if and only if Σ_t δ^t u(c_t) > Σ_t δ^t u(c_t*)


* Critical Assumption: Stationarity of Discount Factor 





* Suppose you were asked to give up $100 one year from
today. How much would you be willing to give up two
years from today instead of one year?

* If you were asked to give up $100 ten years from today,
how much would you be willing to give up eleven years
from today instead of ten years?


* Hypothesis: If (x, t) ~ (y, t*), then (x, t+s) ≺ (y, t*+s)

* Individuals discount the near future more heavily than far
future
e.g., the discount rate for t=1 year to t=2 years is higher
than the discount rate for t=10 years to t=11 years








* How much money would you be willing to pay today in 
order to get $300 one year from today? 

* How much money would you be willing to pay today in 
order to get $30 one year from today? 







* Large amounts suffer less proportional devaluation than 
small amounts. 





* How much would you be willing to get today, rather than 
getting $100 one year from today? 

* How much would you be willing to get two years from 
today, rather than getting $100 one year from today?




* Highly averse to delaying the scheduled consumption 
Relatively indifferent to speed up 


* $7 Record Store Gift Certificate 


Time Delay: WTA | Speed-up: WTP











* Value Function
- reference point: status quo (i.e., current consumption level)


- steeper in losses than gains: v(x) < -v(-x)
e.g. v($20) < -v(-$20)
concave in gains, convex in losses (gain/loss asymmetry)


- less concave for outcomes with greater absolute magnitudes
more elastic for outcomes with larger absolute magnitude
e.g. v(20) = v(25; 1) implies v(200) < v(250; 1)

(absolute magnitude effect)





* How much would you be willing to lose one year from
today, rather than losing $20 today? 
* How much would you be willing to get one year from 


today, rather than getting $20 today? 











* Losses are devalued at a lower rate than gains.





* Loewenstein and Prelec (1992)


U(x) = Σ_i φ(t_i) v(x_i) = Σ_i (1 + α t_i)^(-β/α) v(x_i)


Discount Function φ(t) = (1 + αt)^(-β/α)


* Hyperbolic discounting instead of exponential
This means:
- Discount rate is not stationary over time
- Rate of time preference decreases over time


* Hyperbolic Discount Rate

-φ'(t)/φ(t) = β/(1 + αt)
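The contrast between exponential and hyperbolic discounting can be illustrated with a short sketch. It assumes the generalized hyperbola φ(t) = (1 + αt)^(-β/α) from Loewenstein and Prelec (1992); the parameter values (α = β = 1, r = 10%) and the function names are hypothetical choices for illustration only.

```python
def exponential(t, r=0.10):
    """Exponential discount function: delta**t with delta = 1/(1+r)."""
    return (1.0 / (1.0 + r)) ** t


def hyperbolic(t, alpha=1.0, beta=1.0):
    """Generalized hyperbola (1 + alpha*t) ** (-beta/alpha)."""
    return (1.0 + alpha * t) ** (-beta / alpha)


def implied_rate(phi, t):
    """One-period discount rate between t and t + 1 implied by phi."""
    return phi(t) / phi(t + 1) - 1.0


for t in (1, 10):
    print(t, round(implied_rate(exponential, t), 4), round(implied_rate(hyperbolic, t), 4))
```

The exponential rate is the same between years 1 and 2 as between years 10 and 11, while the hyperbolic rate falls with delay, which is the non-stationarity described above.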





* Framing is important 


* Reluctant to invest in alternatives with high initial costs 
and with benefits over a long period of time 
- energy-efficient appliances (A/C, fridge, etc.)
- protective measures (lock, storm shutters, etc.) 




Q1: How do individuals make the tradeoff between the
investment cost and the stream of future benefits in
RMM (risk mitigation measure) investments?


Q2: Does the magnitude of the investment make a
difference in individuals' choices over time?


Q3: Are uncertain outcomes treated differently than the
certain ones?





* Deadbolt Lock & Steering-Wheel Club:
WTP (1 year) and WTP (2 years)


* Quake Measure & Energy-Efficient Fridge:
WTP (5 years) and WTP (10 years)


WTP(T years) = Σ_{t=0..T-1} δ^t · WTP(1 year), where δ = 1/(1 + r)


* For instance, WTP (1 year) = 50 and WTP (2 years) = 80 for the lock:


80 = 50 + [1/(1 + r)] · 50, so r ≈ 0.67
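The implied discount rate in the lock example can be recovered numerically. This is a minimal sketch, assuming that the multi-year WTP equals the one-year WTP repeated each year and discounted at a constant rate r, with the first year undiscounted; the function names and the bisection bounds are illustrative assumptions.

```python
def multi_year_wtp(wtp_one_year, r, years):
    """Discounted sum of the one-year WTP over the given number of years."""
    delta = 1.0 / (1.0 + r)
    return sum(wtp_one_year * delta ** t for t in range(years))


def implied_rate(wtp_one_year, wtp_multi, years, lo=0.0, hi=100.0):
    """Bisection for r; valid when wtp_one_year < wtp_multi <= years * wtp_one_year."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        # The discounted sum falls as r rises, so move toward the matching rate.
        if multi_year_wtp(wtp_one_year, mid, years) > wtp_multi:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(round(implied_rate(50, 80, 2), 4))
```

For the two-year case the answer can also be read off in closed form: 30/50 = 1/(1 + r), so r = 2/3. The bisection version is useful for longer horizons such as 5 versus 10 years, where no simple closed form exists.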


Protective Investments


* Features of Protective Measures 
An Initial Investment Cost 
Lower Expected Loss over a long period of time


* Examples: 
Deadbolt locks, lightning rods, smoke detectors, 
bracing the foundation against a quake 


* An Intertemporal Problem Under Uncertainty 


Q1:  Misperception of Future


different time horizons 
1-2-3 years 
5-10-20 years 


Q2:  Magnitude of Investment 
different items 
inexpensive: lock and club 
expensive: bracing the house foundation 


Q3: Role of Uncertainty 
risk mitigation measures vs. energy-efficient fridge 


The implied Discount Rates (r) in Each Scenario 





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Q1: Misperception of Future: How do individuals make
the tradeoff between the investment cost and the stream of future
benefits? At what rate do they discount the future?


Q2: Does the magnitude of the investment make a difference in
choices over time?





Q3: Are uncertain outcomes treated differently than the 
certain ones? 





— increased likelihood of a break-in 


— safety concerns: no price difference 
- "I'm not willing to (spend) anything more (for two years).
Protection is served equally in both frames." 





Policy Implications 














* How to encourage cost-effective RMM investments?

* Information Presentation
Using statistics to provide motivation for discounts

* Incentives
Insurance Policy Discounts
Increasing Deductibles - framing
Payment Plans for RMM: long-term loans and rentals

* Regulation
Meeting the Standards to be "insurable"
Industry-based Regulation
Court Decisions (liability system)


[Table: Percentage of subjects consistent with each decision process category (relatively low discounting, myopic behavior, one-shot investment, two-stage process), for the lock (1 year vs. 2 years), the club (1 year vs. 2 years), the quake measure (5 years vs. 10 years) and the fridge (5 years vs. 10 years), with the average implied discount rate within each category.]
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Today’s High Points 


Discussion of Chapter 5: Legitimation 


What does it mean? 
Why is it important? 


Begin Discussion of AI/GA/Neural Nets
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Figure 5.1 


Prescrip 
Qo biasing 





Problem Context Including Available Resources 


Problem Type,
Distinguished by

* Structure
* Values Evoked
* Data Needs
* Complexity


Decision Maker

* Experience-Skills
* Values & Needs
* Goals
* Cognitive Style


Solution Approaches

* Intuitive
* Formal
* Computer-Based


Legitimation Criteria

* Optimality
* Understandability
* Defendability
* Accuracy





Factors Influencing 
Decision Process 


Generation and 


Evaluation of Alternatives 


Final Choice 
by 
Mike 











Quantitative Scaling 
Factors 


PN 


Statistical 


Final 
Decisions 
by 
Jane Officer 


Regression 
Package 





Relative Rankings (Bootstrapping) 





of Applicants 


Figure 5.4


Outlines of Portfolio DSS






Input/Output of Queries 
Reports Display 
Plots/Statistical Analysis 


Graphics Terminal 


Printer Computer 
- Reports for Trust Officer | PDSS Software 


` Client Reports 


(Based on Keen and Scott-Morton)


Data Base Management


Account 
Records 





Security 


Data Base 
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Bernie Munch 
inputs capacity 
plan 


CAPACITY PLANNING 










Demand 
Data and 
Forecasts 








Production 
and Distribution 
Cost Data 













For the given capacity 
plan, a good production 
plan is determined 


PRODUCTION/DISTRIBUTION 
PLANNING 





Transportation 
Cost Data | 








Bernie considers 
changes to his 
capacity plan based 
on its cost/flexibility 

and service implications 


EVALUATION 





Data Files 


FINAL PLAN FOR ACTION 


Figure 5.6 


Slick's Problem


Costs | | Payoffs Probabilities 
of | from Oil of Success 
Test | or Dry-Field or Failure 







Oliver 


Slick Decision 


Analysis 
Routine 




Table 5.7. Decision-making characteristics of different cognitive styles 





Receptive Suspend judgment and avoid preconceptions. 
Be attentive to detail and to the exact attributes of data.
Insist on a complete examination of a data set before 
deriving conclusions. 


Preceptive Look for cues in a data set. 
Focus on relationships. 
Jump from one section of a data set to another. 
Build a set of explanatory precepts.


Systematic Look for a method and make a plan for solving a problem.

Be very conscious of one's approach. 

Defend the quality of a solution largely in terms of the 
method. 

Define the specific constraints of the problem systematically 
early in the process. 

Discard alternatives quickly. 

Move through a process of increasing refinement of
analysis.

Conduct an ordered search for additional information.

Complete any discrete step in analysis that one begins.


Intuitive Keep the overall problem continuously in mind. 

Redefine the problem frequently as one proceeds. 

Rely on unverbalized cues, even hunches. 

Defend a solution in terms of fit. 

Consider a number of alternatives and options 
simultaneously. 

Jump from one step in analysis or search to another and
back again.

Explore and abandon alternatives very quickly.





Source: Adapted by permission of Harvard Business Review. An excerpt from 
“How Managers’ Minds Work'' by James L. McKenney and Peter G. W. Keen, 
vol. 52, no. 2 (May/June 1974). Copyright © 1974 by the President and Fellows 
of Harvard College; all rights reserved. 
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Lecture Notes: OPIM 102 
Dr. Kleindorfer 
February 25, 1998 
Today’s High Points 
• Discussion of AI/GA/Neural Nets


• In General
• Introduction to Prisoner's Dilemma Game


• Discussion of Examples of AI in Practice


• Brief Discussion of Mid-Term Scheduled for 3/4


Financial Applications 





• Loan Evaluation and 
Bankruptcy Prediction 
(Regression-Based Causal 
Forecasting [Marquez 1992]) 


• Bond Rating 
(Judgmental Forecasting) 
• Stock Price Prediction
(Time-Series Forecasting) 











Single Neuron 











Problems with Backpropagation


• slow training
  - each iteration involves a
    relatively complex
    two-stage process
• local minima
• temporal instability
• biological implausibility





Problem of Finding the Optimal Network
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* Architecture (# of hidden nodes) 
  - rules of thumb based on
    numbers of input nodes and
    observations
  - constructing multiple networks
    with different numbers of
    hidden nodes
• Determining when to stop
training
  - training until convergence
  - training for different durations








Genetic Algorithms 


“Search algorithms based 
on the mechanics of 
natural selection and 
natural genetics. They 
combine survival of the 
fittest among string 
structures with a 
structured yet randomized 
information exchange to 
form a search algorithm 
with some of the 
innovative flair of human 
search” (Goldberg 1989) 


Genetic Algorithms


• start with a "population" by
randomly generating a set of
possible solutions


• structures in the population
are rated for their effectiveness
as domain solutions
(their "fitness")

• on the basis of these
evaluations, a new population
of candidate solutions is
formed using specific "genetic
operators" such as
crossover and mutation
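The three steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a method from the lecture: the "domain" is a toy problem (maximize the number of 1s in a bit string), selection keeps the fitter half of the population, and the best individual is carried over unchanged. All names and parameter values are hypothetical.

```python
import random


def fitness(bits):
    """Toy fitness: the number of 1s in the string."""
    return sum(bits)


def crossover(a, b, rng):
    """Single-point crossover of two parent strings."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(bits, rng, rate=0.02):
    """Flip each bit independently with a small probability."""
    return [1 - b if rng.random() < rate else b for b in bits]


def run_ga(n_bits=20, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population.
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        # Step 2: rate every structure by its fitness.
        population.sort(key=fitness, reverse=True)
        best_history.append(fitness(population[0]))
        # Step 3: form a new population with crossover and mutation.
        parents = population[: pop_size // 2]
        children = [population[0]]  # keep the current best unchanged
        while len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            children.append(mutate(crossover(mom, dad, rng), rng))
        population = children
    return max(population, key=fitness), best_history


best, history = run_ga()
print(fitness(best), history[0])
```

Because the best string is always copied into the next generation, the best fitness never decreases from one generation to the next; real applications would replace the toy fitness function with a domain-specific evaluation.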


Combinations of Genetic 





Algorithms and Neural Nets 


• supportive combinations

- genetic algorithms to search for the 
optimal set of inputs for the neural 
network 

- example: Guo and Uhrig (1992) 
inputs to a neural network used for 
fault diagnostics in nuclear power 
plant 


• collaborative combinations
- genetic algorithm as the learning 
method within the neural network 
- genetic algorithms as a search 
mechanism to generate the 


appropriate topology of the neural 
network 


Comments on the Mid-Term Examination: March 4, 1998 
OPIM 102 
Dr. Kleindorfer 


The Mid-Term results for most of the class were good. The overall results were:


Mean = 82.2 
Standard Deviation = 12.5 


Some comments on the Questions follow: 


1. This question was taken directly from Chapters 2 and 3 of KKS. The only real
problem here was that some of you did not connect your answers to 1b to your
answers to 1a. For example, if you answered 1a with "overconfidence" but answered
1b with "brainstorming" (which will do little, if anything, to debias
overconfidence), then you did not answer the question I asked.


2. The Allais and Ellsberg Paradoxes are stated very clearly both in KKS and in my
lecture notes. What some of you did was to describe the indicated paradoxes only in
words. That is, you did not actually present the basic lotteries which form the basis of
these paradoxes. Also, some of you did not describe why either of these was actually
a paradox, in that the choice seems reasonable but violates EU theory.


Concerning 2b, here again I was looking for something like the high bet and low bet
example (it didn't have to be the exact example, but something like it) given in KKS
(p. 152) in the discussion of "preference reversals" (which was also reprinted in my
lecture notes). Some of you missed this question altogether.

3. I had worked this problem out in so much detail so frequently that I was very
surprised that some of you still did not solve it correctly. Here is what I was
expecting:

3a:

U(W + CE(L)) = .5U(W+10) + .5U(W-10)

which for the utility function given leads to:


-exp(-c[100 + CE(L)]) = -.5 exp(-c[110]) - .5 exp(-c[90])


Simplifying this (although you did not have to do this to get full credit for this problem)
by cancelling exp(-100c), we obtain


2 exp(-c CE(L)) = exp(-10c) + exp(10c)




3b: CE(L) is negative. This follows since the utility function given is concave and 
therefore represents the preferences of a risk averse DM. Thus, for any lottery including 
L in part a, the risk premium RP(L) = E(L) — CE(L) > 0. Since E(L) = 0 for the lottery in 
part a, this implies immediately that CE(L) < 0. 


3c. 

U(W) = .3U(W + 50 - WTP) + .7U(W + 10 - WTP)

Using the form of U(W) given and multiplying both sides by exp(-cWTP), this reduces to
-exp(-c[W + WTP]) = -.3 exp(-c[W+50]) - .7 exp(-c[W+10])


which is the same expression as that which characterizes CE(L) except that WTP is
substituted for CE(L). This means that WTP = CE(L) (which by the way is not true in
general but only for the exponential utility function). Thus, again applying the logic of
part (b), we know that the DM is risk averse so that RP(L) = E(L) - CE(L) > 0 for any
lottery L. For the particular lottery L of interest we can easily compute that E(L) = .3x50 
+ .7x10 = 22. Since RP(L) > 0, we conclude from all of this that WTP = CE(L) < E(L) = 
22. 
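Both results can be checked numerically. The sketch below assumes the exponential utility U(w) = -exp(-cw) used in the problem; the value c = 0.05 is a hypothetical choice (the exam's value of c is not reproduced here), and the function name is illustrative. For this utility the certainty equivalent has the closed form CE = -(1/c) ln Σ p_i exp(-c x_i), independent of wealth W.

```python
import math


def certainty_equivalent(outcomes, probs, c):
    """CE of a lottery under U(w) = -exp(-c*w); independent of initial wealth."""
    expected_exp = sum(p * math.exp(-c * x) for x, p in zip(outcomes, probs))
    return -math.log(expected_exp) / c


c = 0.05  # hypothetical risk-aversion coefficient
ce_3a = certainty_equivalent([10, -10], [0.5, 0.5], c)   # lottery in part (a)
wtp_3c = certainty_equivalent([50, 10], [0.3, 0.7], c)   # lottery in part (c)
print(round(ce_3a, 2), round(wtp_3c, 2))
```

For any c > 0 the first value is negative (part b), and the second lies below the expected value of 22 (part c), as the argument in the text requires.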


4. Here again, I had expected that you would actually compute the answer to this 
problem, based on our class discussion. If you showed an understanding of the 
general idea, I also gave at least partial credit. 


4a. For this problem, the key is to show that the RP for the lottery A = [4000, .8; 0, .2] is
positive. Intuitively this is so (and I gave you full credit if you gave the intuitive
argument) because the DM prefers 3000 for certain to A, which means that the DM would
certainly prefer 3200 for certain to A. But E(A) = 3200, which means that the DM
prefers the expected value of the lottery A to the lottery, i.e. E(A) > CE(A), which is
nothing other than saying that RP(A) > 0. In symbols:


CE(A) < 3000 (since the DM prefers 3000 to A)

3000 < 3200 (obvious)

3200 = E(A) (by computing .8x4000 + .2x0)

Therefore, RP(A) = E(A) - CE(A) = 3200 - CE(A) > 3200 - 3000 > 0.


4b. A similar logic establishes that since the DM likes the lottery, call it B, better than
losing 3000 for certain, the DM will also like the lottery better than losing 3200 for
certain. But since -3200 = E(B), we see (similar to part (a)) that RP(B) = E(B) - CE(B)
< 0. In symbols:


CE(B) > -3000 (since the DM prefers the lottery B to losing 3000 for certain)
-3000 > -3200 (obvious)

-3200 = E(B) (by computation)

Therefore, RP(B) = E(B) - CE(B) = -3200 - CE(B) < -3200 + 3000 < 0.


4c. Everyone got 4c correct, showing a V function which was convex in the loss domain 
and concave in the gain domain. 


4d. Loss aversion means that losses will be more painful than the pleasure of an
equivalent and symmetric gain. Graphically this means that the V function falls off more
steeply in the negative domain around the status quo 0 than V rises in the positive domain
around 0. In symbols: -V(-x) > V(x) for all x > 0. This means in particular that the DM
will never like a gamble which gives even odds of winning x or losing x, since for this
gamble, the expected "utility" would be .5xV(-x) + .5xV(x), which is negative (i.e.,
provides less than the status quo, which has utility V(0) = 0) if -V(-x) > V(x).


5. Most people drew the correct tree and computed that the optimal decision was to 
license the product, yielding expected gains = 200. The expected gains of not 
licensing the product were of course equal to 


Prob(HIGH COST)x E(PROFITS/HIGH COST) + 
Prob(LOW COST)xE(PROFITS/LOW COST) 


= (.7x[800] + .3x[200] - 400) + (.7x[800] + .3x[200]- 100) = 190 


The real problem people had was with part c which required that you add to the initial 
decision tree a prior decision as to whether to run a pilot project or not. The 
maximum the Company should be willing to pay will be the difference between the 
expected value if they run the pilot and when they don't (assuming this difference is 
positive). As the attached decision tree shows, the expected value of profits when 
they run the pilot is 260, so they would be willing to pay up to 60 = 260 — 200 at a 
maximum to have the results of the pilot. 


6. Not many people got this problem right for some reason. For part (a) the answer was 
any two of the behavioral inconsistencies which I had in my lecture notes and which 
Ayse Onculer also discussed. These included: 


MYOPIA (the tendency to discount near-term returns more heavily than long-term 
returns, so that only the near term counts); 


GAIN/LOSS ASYMMETRY (the fact that, as in prospect theory, people are risk prone in
the loss domain and risk averse in the gain domain, and the fact that they tend to discount
gains more than they do losses).


For 6(b), I had expected you to present the Loewenstein and Prelec Theory very simply by
writing down the formula for evaluating a stream of income/outlays X = {x(1), x(2),
..., x(T)} as:


U(X) = Σ_t φ(t) V(x(t))


where the φ function drops quickly at 0 and then flattens out, and the V function has the
form of question 4c.


6c. The expression desired would equate the amount of the loan $600 to the stream of 
loan repayments (each at $200) over a very long (assume infinite) future. This gives rise 
to the expression: 


600 = [200/(1+R)] + [200/(1+R)^2] + [200/(1+R)^3] + ... = 200/R


so that the effective interest rate of the loan would have to be R = .33. Alternatively, if 
you assume that she has to pay 600xR in payment every year and she enjoys benefits of 
200 per year, then she would clearly break even if 200 = 600R, which gives the same 
answer. 
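The perpetuity formula can be verified numerically. This is a small sketch with an illustrative function name: it sums the discounted repayments over a long but finite horizon (200 years stands in for "infinite") at the break-even rate R = 200/600.

```python
def present_value(payment, rate, years):
    """Present value of equal payments made at the end of each year."""
    return sum(payment / (1 + rate) ** t for t in range(1, years + 1))


break_even_rate = 200 / 600
pv = present_value(200, break_even_rate, 200)
print(round(pv, 2))
```

At R = 1/3 the discounted repayments add up to the $600 loan amount, which confirms the closed form 200/R used above.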


6d. There were lots of interesting ideas given here. Most importantly, the experimental
evidence we studied suggests that DMs tend to be myopic and loss averse, and they are 
risk averse in the gain domain. To combat myopia, it would be important to arrange for 
the credit granting institution to be the same as the insulation installation institution (just 
like GM or Ford has a financing division for their cars). That way, this institution can 
put in the insulation and then have payments start after the homeowner has begun to 
enjoy benefits of reduced electric bills. There won't be any losses this way (avoided loss 
aversion) and the loan repayment can be structured to have larger payments later rather 
than earlier (by giving a suitable grace period before loan repayment begins). To combat 
risk aversion in the gain domain, it would be important to provide very credible evidence 
on similar homes of why the benefits are relatively certain (why the $200/year is "in the
bag") since if the homeowner believes they are risky they will have lower utility (since 
she is risk averse in the gain domain) and will therefore be less willing to undertake the 
installation and the associated debt burden. 




DECISION TREE FOR PROBLEM 5c 
THIS SHOWS THE MAX WTP = 60 
At this value, the Company is just indifferent 
between running the Pilot and not running it. 
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Lecture Notes: OPIM 102 
Dr. Kleindorfer 
March 16, 1998 


Today’s High Points 


Brief Discussion of Midterm Examination 
Group Problem Solving 


• Background: Why Groups
• Delphi Procedures

• AHP & Expert Choice

• Groupware

• Value-focused Thinking


Figure 6.1 
















TASK REQUIREMENTS 
AND RESOURCES AVAILABLE 
TO THE GROUP 


PROCEDURAL AND 
STRUCTURAL INTERVENTIONS 


GROUP PROCESS 


* Communication 
* Cooperation 

* Trust 

* Responsibility 







GROUP MEMBERS 
* Group Size 
* Homogeneity 

* Knowledge

* Values 







GROUP PERFORMANCE 


* As judged by Group members 


* As judged by others 
* Process Efficiency/Losses
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FiGURE 1. Flowchart of Delphi in Sales Forecasting 
Data Feed-In 
(Numerical & Graph) 






Questionnaire 


| 







Formulation of First 
Round Questionnaire 


Expert Panel | 
Selection 







| Distribution and Collection 
of Responses 







Statistical Analysis 


Formulation of Second
Round Questionnaire


Distribution and Collection
of Responses


Statistical Analysis | Search, Collect, Edit Relevant Data Requested


Formulation of Third
Round Questionnaire


Distribution and Collection
of Responses


Statistical Analysis


Final Estimation
and Circulation





Analytic Hierarchy Process (AHP)
* T.L. Saaty (1976)

* Many Applications

* Public Sector

* Private Sector


* Versatile GDSS





Table 4. City Rankings with Respect to Cost of Living 











City        | Cost of Living Index (COL) | Percentage above Minimum COL | Inverse of Percentages | Normalized Inverse Percentages
Boston      | 335.1                      | 1.392                        | 0.7184                 | 0.31836
Los Angeles | 345.1                      | 4.418                        | 0.2263                 | 0.10031
St. Louis   | 330.5                      | 1.000                        | 1.0000                 | 0.44315
Houston     | 341.1                      | 3.207                        | 0.3118                 | 0.13818
Е | | 2.0302 | 
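The columns of Table 4 can be reproduced from the cost-of-living index alone. The sketch below is an illustration with hypothetical names; it assumes, as the table's St. Louis row suggests, that the cheapest city is assigned 1.000 rather than 0 so that its inverse is defined.

```python
col_index = {
    "Boston": 335.1,
    "Los Angeles": 345.1,
    "St. Louis": 330.5,
    "Houston": 341.1,
}


def cost_of_living_priorities(index):
    """Normalized inverse of each city's percentage above the minimum COL."""
    lowest = min(index.values())
    pct_above = {city: 100.0 * (value - lowest) / lowest for city, value in index.items()}
    # Assumption taken from the table: the cheapest city gets 1.000, not 0.
    pct_above = {city: (p if p > 0 else 1.0) for city, p in pct_above.items()}
    inverse = {city: 1.0 / p for city, p in pct_above.items()}
    total = sum(inverse.values())
    return {city: inv / total for city, inv in inverse.items()}


priorities = cost_of_living_priorities(col_index)
for city, weight in priorities.items():
    print(city, round(weight, 4))
```

The resulting weights agree with the last column of the table to about three decimal places; small differences come from the table rounding the intermediate columns.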


Table 5. Scale of Measurement for AHP 


Numerical Values | Definition
1                | Equally important or preferred
3                | Slightly more important or preferred
5                | Strongly more important or preferred
7                | Very strongly more important or preferred
9                | Extremely more important or preferred
2, 4, 6, 8       | Intermediate values to reflect compromise
Reciprocals      | Used to reflect dominance of the second
                 | alternative as compared with the first.
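To show how judgments on this scale become relative priorities, here is a minimal sketch. It uses the row geometric-mean method, a common approximation to the principal-eigenvector calculation that AHP software performs; the 3x3 comparison matrix is a made-up, perfectly consistent example and does not come from the city tables.

```python
import math


def ahp_priorities(matrix):
    """Approximate AHP priorities by normalizing the row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical judgments: item 1 is 2x as preferred as item 2 and 4x as item 3.
comparisons = [
    [1.0, 2.0, 4.0],
    [1 / 2, 1.0, 2.0],
    [1 / 4, 1 / 2, 1.0],
]
weights = ahp_priorities(comparisons)
print([round(w, 4) for w in weights])
```

For a perfectly consistent matrix such as this one, the geometric-mean weights coincide with the eigenvector weights (here 4/7, 2/7 and 1/7); with inconsistent judgments the two methods can differ slightly.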





Table 6. City Comparison with Respect to Climate 
















Pairwise Comparisons | 
Los Angeles | St. Louis | Houston | Relative Priority | 
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Table 7. City Comparison with Respect to Elementary and High Schools 


~ Pairwise Comparisons 
| Los Angeles | St. Louis | 
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Table 8. City Comparison with Respect to Colleges and Universities 


Pairwise Comparisons
Boston | Los Angeles | St. Louis | Relative Priority


Los Angeles | 1/2
St. Louis | 1/5
0.079





Table 9. City Comparison with Respect to Commuting 
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Table 10. City Comparison with Respect to Arts and Recreation 


Pairwise Comparisons
Los Angeles | St. Louis | Houston | Relative Priority
Boston | 1/2





Table 11. Comparison of Subcriteria 


| Elementary and High Schools
| Colleges and Universities 


| Commuting 
| Arts and Recreation 





Table 12. Comparison of Criteria with Respect to the Goal 
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Table 14. Composite Priorities of the Cities 


Distance | Cost of Living | Climate | Education | Quality of Life
(0.157)  | (0.107)        | (0.302) | (0.088)   | (0.346)








Projection Screen | White Board 





(MWS) Facilitator’s 
Workstation 


Projector 


Printer


Figure 6.4
Typical GDSS Facility


TeamFocus Process 


TeamFocus
Process:


• Pre-Session
Planning


• Guidance/
Consultation


• Creative TeamKit/2
Application


• Focus on Content




TeamFocus uses a very basic process
which includes the following elements:


Pre-session planning: TeamGuide and
meeting initiator meet to plan the upcoming
meeting


Together the TeamGuide and meeting
initiator:

- Identify objectives

- Determine appropriate TeamKit/2 tools
- Structure the agenda


The agenda is then sent to participants


TeamGuide is creative in applying
TeamKit/2


The TeamGuide is skilled in helping
customers identify objectives to be
addressed in the meeting and developing
a road map to meet those objectives
using TeamKit/2.


On the day of the TeamFocus session, the
participants, the meeting initiator and the
TeamGuide assemble in the TeamRoom.


Benefits
During Session:


• Improved
Communication


• Effective Discussion
of Sensitive Issues


Outgrowth:


• Shorter Meetings
• More Effective Results
• Team Ownership


• Team Building


Together they engage in a structured
approach to resolving issues, identifying
opportunities and generating ideas or
plans. The team gains leverage from the
following characteristics of TeamFocus:


• Anonymity

• Focus on content, not personalities

• Equal participation

• Parallel and simultaneous communication
• Complete record of meeting


• Skilled, neutral TeamGuide


TeamFocus users report these benefits:
• Shorter meetings
• More effective results
• Team ownership
• Team building
• Improved communication


• Effective discussion of sensitive issues













Sample TeamFocus 
Session 


The versatility and flow within a
TeamFocus session is limited only by the
creativity of the TeamGuide and meeting
initiator. TeamGuides are called upon to
share their experience and explore variation
for each session.










Electronic Brainstorming


What inhibits our customers from feeling 
delighted with our products and our services? 





Team ideas 





Idea Organization 


From Electronic Brainstorming, pull out "key" 
inhibitors to achieving customer delight. 





— Vote—Alternative Evaluation 


Rate inhibitors against the following criteria:
1. Resources necessary to overcome inhibitors
2. Degree of negative effect on customer





Team Priorities 





Topic Commenter 


Using the top 3 inhibitors identified in
Alternative Evaluation, generate ideas on how to
overcome the most significant inhibitors.





Team Specification 





Policy Formation 


To enhance the "delight" factor with our
customers, who in this room should do what
and when? What are the next steps? Be
specific.


The Framework of Value-Focused Thinking 


the strategic 


set of means objectives 






Strategic objectives of 
the decisionmaker 
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а. alternative-focused thinking with alternatives A. В. and C 


set of means objectives 


fundamental objectives for the 
specific decision context 


the strategic 
decision context 








‘Strategic objectives of 
the decisionmaker 


b. the decision frame based on value-focused thinking 


Figure 2.7. Contrasting alternative-focused thinking and value-focused 
thinking for the same decision frame 


Table 2.1. Comparing sequences of activities with alternative-focused and 
value-focused thinking 


Alternative-focused thinking for decision problems

1. Recognize a decision problem
2. Identify alternatives
3. Specify values
4. Evaluate alternatives
5. Select an alternative


Value-focused thinking

For decision problems:
1. Recognize a decision problem
2. Specify values
3. Create alternatives
4. Evaluate alternatives
5. Select an alternative

For decision opportunities, before specifying strategic objectives:
1. Identify a decision opportunity
2. Specify values
3. Create alternatives
4. Evaluate alternatives
5. Select an alternative

For decision opportunities, after specifying strategic objectives:
1. Specify values
2. Create a decision opportunity
3. Create alternatives
4. Evaluate alternatives
5. Select an alternative


Identifying and Structuring Objectives


minimize loss of 
life to children 


minimize 
loss of life
minimize loss of 
life to adults 









| minimize serious 
| injuries to children | 





minimize serious 
injuries 





minimize serious 
injuries to adults 






minimize minor 


injuries 





a. a fundamental objectives hierarchy 


minimize
driving under
influence
of alcohol


minimize
accidents












Maximize use of 
safety features 
on vehicles 












motivate
purchase of
safety features
on vehicles





b. a means-ends objectives network


Figure 3.1. Objectives structures for the safety of automobile travel 
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Insights for the Decisionmaking Process 


thinking about values 


* identifying and structuring
objectives (chapter 3)
* measuring objectives (chapter 4)
* quantifying objectives (chapter 5)


* uncovering
hidden
objectives
(chapter 6)





improving the decisionmaking 
process (chapter 10) 

* creating alternatives
(chapters 7 and 8)

* identifying decision
opportunities (chapter 9)

* guiding information collection
* evaluating alternatives
* interconnecting decisions
* improving communication

* facilitating involvement in multiple-
stakeholder decisions

* guiding strategic thinking 






better consequences 





note: an arrow means “leads to" 


Figure 10.1. The influence of value-focused thinking 
on the processes of decisionmaking
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Table 11.4. Objectives hierarchy of the government panel 


System flexibility 

Institutional 
Timely licensability 
Adaptability to regulatory changes 

Technical 
Retrievability (e.g., from the monitored retrievable storage facility) 
Durability of cask 
Handleability (easy to lift, etc.)
Independence of type of repository 


Costs 
Direct economic costs 
State government costs 
Federal government costs 
Utility company costs
Cost impacts on other parts of the system
Indirect economic costs 
Costs for state and local responses to system 
Court costs, regulatory costs, etc.
Road maintenance costs 


Health and safety impacts 
Radiation exposure 
To the workers 
To the public 
Transportation accidents 
To the workers 
To the public 
Other accidents 
To the workers 
To the public 


Environmental impacts 
Groundwater contamination 
Roadbed damage 
Visual impacts from storage 
Land resources for plants 


Political impacts 
Public confidence in 
Government 
Nuclear industry 


Fulfill government commitments 
Public acceptability 
Equity of risks
Among groups (public, transportation workers, industry workers) 
Geographical 





Table 11.5. Objectives hierarchy of the public interest panel 





Health and safety impacts 
Radiation exposure 
To the public 
To the workers 
Transportation accidents 
To the public 
To the workers 
Future generations 
Genetic effects 
Cancer 


Environmental impacts 
Land use impacts (e.g., storage facilities) 
Impacts on biosphere (from radiation release) 


Political and institutional impacts
Resilience against 
Regulatory changes 
Major political changes 
Reduce need for regulation/inspection 
Political acceptability 
Increase trust and credibility 
Practical implementation


Fairness and equity 
Equity between risk bearers and beneficiaries of nuclear power 
Equity between present and future generations 
Liability and compensation 


Psychological concerns 
Fears and anxieties
Assurance of a compensation system 


Costs 





Fundamental objectives hierarchy


1. Health and safety impacts (P) 
Radiation exposure 
To the public (PGT) 
To the workers (PGT) 
Transportation accidents 
To the public (PGT) 
To the workers (PGT) 
Future generations 
Genetic effects (P) 
Cancer (P) 


2. Economic costs (G) 
State government costs (G) 
Federal government costs (PGT) 
Utility company costs (PGT)


3. Environmental impacts (G)
Visual (G) 
Land use (PG) 


4. Political impacts (G) 
Public confidence in the technical system (PG) 
Public confidence in government (G) 
Local and state attitudes (GT) 


5. Social impacts (PT)
Fears and anxieties (P) 
Transportation system inconvenience (PT) 


6. Fairness (PG) 
Equity 
Transportation workers, industry workers, public (G) 
Geographical (G) 
Beneficiaries of nuclear power (P) 
Intergenerational (P) 
Liability (P) 
7. Scheduling (T)
Timely availability of system (GT) 
Ability to handle appropriate quantities of spent fuel (T) 
8. Flexibility (T) 

Technical with respect to 
Consolidation of spent fuel (T) 
Reprocessing (T) 

Plant types (T) 
Retrievability (G) 
Repository media (GT) 

Institutional with respect to 

Transport regulation changes (T)
Regulation changes (PGT) 
Political changes (P) 


Note: P, G, and T stand for the public interest, government, and technical panels,
respectively.


Table 11.9. Combined objectives hierarchy concerning air pollution in
the Los Angeles Basin


1. Public health and safety 
2. Due to air pollution 
3. Impairment of lung function (e.g. chronic breathing diffi- 
culty) 
4. Children 
5. Elderly 
6. Other (e.g. asthmatics) 
7. Heart attacks
8. Cancers 
9. Lung 
10. Skin 
11. Acute health effects (e.g. difficulty in breathing, coughing,
eye irritation, headaches, nausea)
12. Reproductive effects 
13. Due to air pollution control 
14. Fatalities (e.g. due to vehicle accidents, less income, side effect
of medication, transmission of disease, workers' accidents)
15. Illness
16. Injuries 


17. Psychological impacts (e.g. fear, depression, embarrassment concern- 
ing surroundings, stress, reduced mental acuity and concentration) 


18. Visibility 
19. In Western L.A. Basin 
20. In Eastern L.A. Basin 
21. In mountains east of L.A. Basin 


22. Lifestyle impacts 
23. Enjoyment of the environment (e.g. affected by visibility, smell,
deterioration of materials and products, environmental degradation,
limited physical activity)
24. Less convenience 
25. Restrictions on product use
26. Restrictions on vehicle use 
27. Restrictions on physical activity 
28. Restrictions on home use
29. Limitations on where to live 
30. Forced work at home 
31. Degraded personal relationships 
32. Within families 
33. With others 


34. Environmental impacts 
35. Local 
36. Tree loss in forests 
37. Degradation of private gardens, plants, and fruit trees 
38. Global (e.g. minimize L.A. contribution to Earth's warming) 


39. Social impacts 
40. Restricted upward mobility 
41. Social stability and crime 
42. Conserve older neighborhoods 
43. Degradation of ethnic neighborhoods 


44. Economic costs 
45. Due to pollution 
46. To individuals 
47. Low-income 
48. Other 
49. To business and industry 
50. To local government (e.g. due to federal sanctions, du- 
plicity) 
51. Due to pollution controls 
52. To individuals 
53. Low-income 
54. Other 
55. To business and industry 
56. To local government 


57. Socioeconomic impacts

58. To individuals 
59. Lost jobs 
60. Fewer new jobs 

61. To business and industry 
62. Less productivity 
63. Fewer new businesses 
64. Closing of businesses


65. Equity 
66. With respect to economic costs (e.g. who pays: rich vs. poor, in
L.A. Basin vs. out, big vs. little polluters, stockholders vs. ratepayers,
balanced over time)
67. With respect to benefits and negative impacts excluding costs
(e.g. balance impacts over the entire L.A. Basin region)


Table 11.10. Attributes for the assessment of value tradeoffs among objectives
concerning Los Angeles air pollution

                                                          Levels
Attribute                                          Good           Bad            Unit
-------------------------------------------------------------------------------------------
Public health and safety: Annual number of         0              1,000          1 adult
otherwise healthy adults diagnosed as having
a 20% impairment in lung function

Psychological impacts: Average annual number                      100            1 person-day
of days L.A. Basin residents say they suffer
from psychological effects of air pollution

Visibility: Average daily miles of maximum                        10             1 mile
visibility in the eastern L.A. Basin

Lifestyle impacts: Annual percentage of                           20             1 trip
"desired vehicle trips" that are banned

Environmental impacts: Annual number of                           1 million      1 household
households with noticeable degradation of
private gardens, plants, or fruit trees

Social impacts: Number of L.A. Basin residents                    400,000        1 person
without "upwardly mobile" job opportunities
owing to pollution or pollution control

Socioeconomic impacts: Annual number of jobs                      100,000        1 job
lost

Costs: Total annual cost                                          $7.2 billion   $1 billion

Equity: Equity with respect to distribution of     proportional   equal          equal to
cost. Constructed scale: worst corresponds to                                    proportional
low-income families ($10K) paying same (e.g.                                     per household
$1,200) as high-income families ($30K); best
corresponds to proportional payments, e.g.
low-income families paying $400, high-income
families $9,000
Summary Comments on Value Tree Analysis


Thesis 1: One can identify the Decision-Maker(s) 
and Decision! 


Thesis 2: Stakeholders are willing to reveal their 
values! 


Thesis 3: Comprehensive Approach Required! 


Thesis 4: Quantification of Outcomes/Options is 
Useful! 


OPIM 102: Decision Processes 
Dr. Kleindorfer 
March 30, 1998 


Today
• Review of Cournot
• The Commons Problem and Applications Thereof
• Mixed Strategies and Other Static Game Theory Topics


Problems Due on 4/1/98 


Next Time: 4/1/98 


Auctions in Theory & Practice 


Comparing Cournot-Nash Outcomes as n Increases
P(Q) = a - Q, Q = q_1 + q_2 + ... + q_n; Example: a = 100
Duopoly: Both firms with the same unit cost c; e.g. c = 10
Each firm tries to maximize:

u_i(q) = Profits per Period = [P(Q) - c] q_i - F (e.g., F = 0)
yielding the Best Response Functions (obtained by
maximizing i's profit function taking what the other firm
does as FIXED):

b_i(q_j) = [a - c - q_j]/2,   i = 1, 2,   j ≠ i
Note: Welfare W = CS + PS = [(a - P)/2]Q + (P - c)Q
Cournot Solution for Arbitrary n
For any n, we have at the Nash Equilibrium
q_i(n) = (a - c)/(n + 1)
Q(n) = n(a - c)/(n + 1) --> (a - c)
P(Q) = a - Q = (a + nc)/(n + 1) --> c
Π_i = [(a - c)/(n + 1)]^2 --> 0
W = [n(n + 2)/(n + 1)^2][(a - c)^2/2] --> (a - c)^2/2


Thus, as n grows large, all quantities converge to the 
conditions associated with competitive equilibrium: 


• Marginal Cost Pricing
• Zero Profits
• Maximum Total Welfare
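
The convergence claims can be checked numerically. The following is a minimal sketch, assuming the linear demand P(Q) = a - Q, the example values a = 100 and c = 10, and F = 0; the function name `cournot` is chosen here for illustration and is not part of the course materials.

```python
from fractions import Fraction

def cournot(n, a=100, c=10):
    """Symmetric n-firm Cournot equilibrium for P(Q) = a - Q and unit cost c."""
    q = Fraction(a - c, n + 1)               # output per firm
    Q = n * q                                # total output
    P = a - Q                                # market price
    profit = (P - c) * q                     # profit per firm (F = 0)
    welfare = (a - P) * Q / 2 + (P - c) * Q  # W = CS + PS
    return q, Q, P, profit, welfare

# As n grows, price approaches c, profits approach 0, welfare approaches (a-c)^2/2.
for n in (1, 2, 10, 100, 1000):
    q, Q, P, profit, W = cournot(n)
    print(n, float(Q), float(P), float(profit), float(W))
```

With n = 1 the same function gives the monopoly solution, which is the benchmark for the collusion question that follows.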


Question: What solution would occur if all firms colluded on
price and output? [Ans: Monopoly Solution]


Question: What are the policy implications of this model for
anti-trust policy (e.g., barriers to entry and collusion)?
Dominated Strategies & Nash Equilibria


Recall: Let G = <N, X, u> be a game in normal form and
let i be any player in G. A strategy (or alternative) x_i is
(strictly) dominated by a strategy (or alternative) y_i if for
each feasible combination of the other players' strategies,
i's payoffs from playing x_i are (strictly less) no greater than
i's payoff from playing y_i:

u_i(x_i, x_-i) ≤ u_i(y_i, x_-i) for all x_-i ∈ X_-i
(with < in place of ≤ for strict domination)


EXAMPLE: Prisoners Dilemma Game

         C       D
C      3, 3    0, 5
D      5, 0    1, 1


Theorem: In any Game G, Nash Equilibrium Strategies
always survive iterated elimination of strictly dominated
strategies. [Thus, one way of simplifying games is to first
eliminate strictly dominated strategies. The resulting
reduced game will not have "lost" any Nash equilibria in
the process.]
Example: Eliminating Strictly Dominated Strategies


                        Player 2
                 LEFT     MIDDLE    RIGHT
Player 1  UP     1,0      1,2       0,1
          DOWN   0,3*     0,2       2,0*
* Pareto points


From Player 2's perspective, Middle dominates Right whatever
Player 1 does, so eliminate Right.

From Player 1's perspective, eliminate Down: if Player 2 plays
Left or Middle, Up strictly dominates Down.

From Player 2's perspective, assuming Player 1 chooses Up,
Middle dominates Left. The surviving outcome is (Up, Middle).
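
The elimination procedure can be sketched in a few lines of code. This is an illustrative sketch only: the function names are hypothetical, the payoffs are those of the example matrix as it appears in these notes, and only domination by pure strategies is checked.

```python
def strictly_dominated(payoff, own, other):
    """Strategies in `own` that are strictly dominated by another strategy in `own`.
    payoff(s, o) is the player's payoff from own strategy s against opponent strategy o."""
    return {
        s for s in own
        if any(all(payoff(t, o) > payoff(s, o) for o in other) for t in own if t != s)
    }

def iterated_elimination(u1, u2, rows, cols):
    """Repeatedly delete strictly dominated rows (Player 1) and columns (Player 2)."""
    rows, cols = list(rows), list(cols)
    while True:
        dead_rows = strictly_dominated(lambda r, c: u1[r, c], rows, cols)
        dead_cols = strictly_dominated(lambda c, r: u2[r, c], cols, rows)
        if not dead_rows and not dead_cols:
            return rows, cols
        rows = [r for r in rows if r not in dead_rows]
        cols = [c for c in cols if c not in dead_cols]

# Payoffs (Player 1, Player 2) from the example matrix.
game = {
    ("UP", "LEFT"): (1, 0), ("UP", "MIDDLE"): (1, 2), ("UP", "RIGHT"): (0, 1),
    ("DOWN", "LEFT"): (0, 3), ("DOWN", "MIDDLE"): (0, 2), ("DOWN", "RIGHT"): (2, 0),
}
u1 = {k: v[0] for k, v in game.items()}
u2 = {k: v[1] for k, v in game.items()}
survivors = iterated_elimination(u1, u2, ["UP", "DOWN"], ["LEFT", "MIDDLE", "RIGHT"])
print(survivors)
```

Because only strictly dominated strategies are removed, the order of elimination does not affect the surviving set, which is why rows and columns can be processed in the same pass.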
Commons Dilemma Problems
The Nature of the Problem 


There is a Strictly Dominating Strategy for everyone 
(which is therefore a Nash Equilibrium), but this Strategy 
is Pareto Dominated by another (cooperative) strategy. 
The cooperative strategy requires communication or some 
sort of sanctions against defectors to be implemented. 


Example (KKS: Page 250) Social Dilemma Game


Number of        Payoffs to     Payoffs to
Cooperators      Defectors      Cooperators
     5               -              12
     4              20               9
     3              17               6
     2              14               3
     1              11               0
     0               8               -


Note that for the game G = <N, X, u> with payoffs as
above there is only one Nash equilibrium and this is
for everyone to defect. A better solution (for everyone!) is
for everyone to cooperate. Note that the essence of the 
Commons Problem is that, no matter how many people are 
currently cooperating, it pays for each individual to defect. 
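
The two claims (defecting always pays individually, yet universal cooperation beats universal defection) can be checked directly from the payoff table. A minimal sketch, assuming five players and reading each row as the payoff when that many players in total cooperate; the names below are illustrative.

```python
# Payoffs from the table, indexed by the total number of cooperators (five players).
DEFECT = {4: 20, 3: 17, 2: 14, 1: 11, 0: 8}
COOPERATE = {5: 12, 4: 9, 3: 6, 2: 3, 1: 0}

def gain_from_defecting(k_others):
    """Extra payoff one player gets by defecting when k_others of the other four cooperate."""
    return DEFECT[k_others] - COOPERATE[k_others + 1]

gains = [gain_from_defecting(k) for k in range(5)]
print(gains)                      # positive in every case: defecting strictly dominates
print(COOPERATE[5], DEFECT[0])    # all cooperate vs. all defect
```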
More Generally: (See Gibbons, Page 27)
G = <N, X, u>


where the behavior space for each player X_i is the space of
nonnegative numbers (like goats grazing on the commons)


N = the set of players (citizens living around the commons)
u_i(x) = x_i V(x_1 + ... + x_n) - c x_i


where V is the value per "goat" grazing on the commons
and is assumed to be decreasing and concave (that is, the
more goats that graze the less each one of them is worth at
market time). This gives rise to a situation in which each
citizen has an incentive to put more "goats" on the
common (with payoff V for each additional goat), but with
a negative consequence for all the citizens, in the sense that
the noncooperative solution will lead to less total welfare
than


Max [(x_1 + ... + x_n) V(x_1 + ... + x_n) - c(x_1 + ... + x_n)]
A good Example of this type of problem is associated 


with Global Warming and other International 
Environmental Initiatives. 
Mixed Strategies


Observation: Basic Theorem is the following. For any
Game G = <N, X, u> for which X is a convex set, and u_i is
a concave function in x_i for every x_-i, there is a solution
(a Nash equilibrium).
An important application of this is to Mixed Strategies.


Example: Consider the game of Matching Pennies (P. 29
of Gibbons)


                                  Player 2
                     Heads (q)               Tails (1-q)
Player 1
  Heads (p)          -1, 1                   1, -1
                     -pq, pq                 p(1-q), -p(1-q)
  Tails (1-p)        1, -1                   -1, 1
                     (1-p)q, -(1-p)q         -(1-p)(1-q), (1-p)(1-q)


Let payoffs be determined by strategies p = PROB P1 plays
Heads = 1 - PROB P1 plays Tails; and q = PROB P2 plays
Heads = 1 - PROB P2 plays Tails. Let payoffs from Game
be u_i(x) = Expected utility of x, as shown in Matrix Above.
Then what is P1's Best Response Function?
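
One way to approach the question is to note that P1's expected payoff is linear in p, so the best response depends only on the sign of the slope in p. The sketch below assumes the payoffs of the matrix above; the function names are illustrative.

```python
from fractions import Fraction

def u1(p, q):
    """Player 1's expected payoff: the sum of the four expected cell entries."""
    return -p * q + p * (1 - q) + (1 - p) * q - (1 - p) * (1 - q)

def best_response_p1(q):
    """Describe the set of p in [0, 1] that maximizes u1(p, q)."""
    slope = u1(1, q) - u1(0, q)   # u1 is linear in p; the slope equals 2 - 4q
    if slope > 0:
        return "p = 1 (Heads)"
    if slope < 0:
        return "p = 0 (Tails)"
    return "any p in [0, 1]"

for q in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    print(q, best_response_p1(q))
```

P1 is indifferent only at q = 1/2, which is why the equilibrium of this game must be in mixed strategies.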
What Should You Know from Rules of the Game, from
KKS (Chap. 7) and from Chapter 1 of Gibbons? 


Intuition of various simple games in Rules of the Game and
when equilibrium and Pareto solutions exist to these games.


What is the definition of Nash equilibrium, Pareto solution, 
and Dominated solution? 


What is the procedure for sequential elimination of strictly 
dominated strategies? Understand that this will never get 
rid of Nash equilibrium strategies. 


How to derive Best Response Functions and what they are. 


Cournot Game and how the Nash equilibrium to this game 
is derived. 


What a mixed strategy is and what best response functions 
are for 2-person games with mixed strategies. 


The problems I assigned. 


The Commons Dilemma Game. 
Lecture Notes: OPIM 102
Dr. Kleindorfer
April 1, 1998
Today:
• Mixed Strategies Revisited
• Auctions in Theory & Practice


Next Time: Jim Laing will lecture on Collective Choice 
Read KKS, Chapter 7, pp. 260-267 


April 8: Dynamic Games: Read the reading previously 
assigned for April 6 (Chapter 4 of Rasmusen’s 
Book) 


Mixed Strategies for the Negative Coordination Game 





X_1 = [0,1]   X_2 = [0,1]


u_1(p,q) = 3pq - p + pq - q + pq + 1 - p - q + pq
         = 6pq - 2p - 2q + 1


u_2(p,q) = -1pq + 3p(1-q) + 2(1-p)q + (1-p)(1-q)
         = -pq + 3p - 3pq + 2q - 2pq + 1 - p - q + pq
         = -5pq + 2p + q + 1


Max [(6q - 2)p - 2q + 1]
 p


Max [(1 - 5p)q + 2p + 1]
 q

Question: When P is high, why should q be low? 
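
The mixed-strategy equilibrium follows from the two indifference conditions 6q - 2 = 0 and 1 - 5p = 0. The sketch below is illustrative: the payoff matrices A and B are read off the expansions of u_1 and u_2 above (they are not printed in these notes), and the helper assumes an interior equilibrium exists, so the denominators are nonzero.

```python
from fractions import Fraction

# Rows: Player 1's first/second action (probabilities p, 1-p).
# Columns: Player 2's first/second action (probabilities q, 1-q).
A = [[3, -1], [-1, 1]]   # Player 1's payoffs, inferred from u_1
B = [[-1, 3], [2, 1]]    # Player 2's payoffs, inferred from u_2

def mixed_equilibrium(A, B):
    """Interior mixed equilibrium of a 2x2 game from the two indifference conditions."""
    # q makes Player 1 indifferent between the two rows.
    q = Fraction(A[1][1] - A[0][1], A[0][0] - A[0][1] - A[1][0] + A[1][1])
    # p makes Player 2 indifferent between the two columns.
    p = Fraction(B[1][1] - B[1][0], B[0][0] - B[0][1] - B[1][0] + B[1][1])
    return p, q

p_star, q_star = mixed_equilibrium(A, B)
print(p_star, q_star)
```

For p above the equilibrium value the coefficient (1 - 5p) on q is negative, so Player 2's best response is q = 0, which is the pattern the question asks about.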


Introduction to Auction Games 


Dominating Strategy: Let a game G = [N, S, U] be
given. For player i, we define the set of dominating strategies
D_i(G) as follows: s_i ∈ D_i(G) iff u_i(t_i, s_-i) ≤ u_i(s_i, s_-i) for all t_i ∈ S_i
and for all s_-i ∈ S_-i.


Auction Games: Let there be a set of n players or bidders, who 
value an object to be auctioned as follows: 


0 < r < a_1 ≤ a_2 ≤ ... ≤ a_n


where r is the reservation price to the seller, i.e. the minimum 
price the seller is willing to accept. 


Sealed First-Bid Auction Game: G = [N, S, U]
Let S_i = [r, ∞), for all i ∈ N = {1, 2, ..., n}
Define for each s ∈ S the winning bidder(s) as follows:


w(s) = {i ∈ N such that s_i = Max {s_j; j ∈ N}}, i.e., w(s) gives the
player numbers for the winning bids at s = (s_1, ..., s_n).


For every i ∈ N, define u_i(s) = a_i - s_i if i = Min {j ∈ w(s)} and
u_i(s) = 0 otherwise (could also assign by lottery in the case of
ties). Thus, player i's utility is the difference between his value
for the object and his bid, if he wins the auction, and otherwise
is zero.


Interesting Fact: In the First-Bid Auction (also the English
Oral Auction), the "sincere strategy" a_i dominates any s_i for
which a_i < s_i, so overbidding is not logical. However, it still
pays to underbid.


Sealed Bid Second Price Auction Game: G = [N, S, U] 


This Game is the same as for First Price Auction, except now
the price paid by the winning bidder is the highest losing bid (i.e.,
the second highest price). Here the "sincere strategy" a_i is

actually a dominating strategy (i.e., a_i ∈ D_i(G) for every i ∈ N).
Why?
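
A brute-force check on a small grid of bids can make the dominance claim concrete. This is a sketch under stated assumptions: integer bids on a hypothetical grid, two rival bidders, ties awarded to the lowest-numbered bidder as in the definition of u_i above, and the function names invented for illustration.

```python
from itertools import product

def second_price_payoff(i, bids, value):
    """Bidder i's payoff: the highest bid wins (lowest index on ties) and the
    winner pays the highest losing bid."""
    top = max(bids)
    winner = min(j for j, b in enumerate(bids) if b == top)
    if winner != i:
        return 0
    return value - max(b for j, b in enumerate(bids) if j != i)

def sincere_is_dominating(value, grid, n_others=2):
    """Check u_i(t, s_-i) <= u_i(value, s_-i) for every own bid t and rival profile."""
    for others in product(grid, repeat=n_others):
        sincere = second_price_payoff(0, (value, *others), value)
        for t in grid:
            if second_price_payoff(0, (t, *others), value) > sincere:
                return False
    return True

grid = range(1, 11)   # hypothetical bid grid; values are drawn from the same grid
result = all(sincere_is_dominating(v, grid) for v in grid)
print(result)
```

The intuition behind the check: a bidder's own bid affects only whether he wins, never the price he pays, so shading or inflating the bid can only change the outcome in cases where doing so hurts him.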


New Zealand 8 MHz UHF License Auction

■ A Simultaneous Sealed-Bid Second Price Auction

Lot    Bidder                  High Bid       2nd Bid
5      Sky Network TV          $1,121,000     $401,000
6      Totalisator A.B.        $401,000       $100,000
7      United Christian        $685,200       $401,000


New Zealand 8 MHz UHF License Auction

Problems with this design:

■ Second-price approach creates problems regarding public
  perceptions
  - In one instance, high bid was $100,000, price was $5,000
■ Simultaneous sealed-bid creates coordination problems
  - Requires guesswork by bidders
  - Threat of winning too many licenses
  - No information revealed


Simultaneous Ascending Auction Rules

■ Bidding on all assets or contracts occurs simultaneously, in rounds

■ Bids are announced after each round

■ The high bid (plus a percentage) becomes the minimum bid for the
  next round

■ Bidders' eligibility limits the number of bids they can make

■ Bidders' eligibility levels initially based on deposits; bidders must
  maintain sufficient bidding activity or their eligibility is reduced;
  activity requirements change over the course of the auction

■ Bid withdrawals are permitted, but penalized
■ The auction ends when there are no new bids


Example Simultaneous Ascending Auction for
Electric Power Supply

■ Several dozen generation assets and contract entitlements
  would be offered for sale

■ Each property might be accompanied by a reserve price
  defined as the least of that specified by the seller or the
  applicable regulator
■ Auction procedures, seller's disclosures and legal sale
  contracts distributed in advance to qualifying bidders
■ Bidders provide advance deposits on a per-MW basis for
  maximum capacity upon which they can bid


Example Auction (Continued)

■ Auction begins with two rounds daily, each open for bid
  submission for two hours. Each bid in each round must be at least
  as large as:
  - the property's reserve price
  - any previous bid and a designated increment (say, 5%)
■ Activity rule requires that eligibility for subsequent rounds be
  reduced proportionately if bidder doesn't meet bid increment
  - Rules may also allow bid withdrawals and limited waivers from activity rule
■ Auction closes when no property receives a new bid as high as the
  required minimum bid
■ Properties sold to bidders with standing high bid, who must then
  immediately submit down payments


FCC Experience with Simultaneous
Ascending Design: Nationwide Spectrum

(Table of winning bidders and final bids; total of final bids: $617,006,674)


OPIM 102: April 8 & 13 
Dr. Kleindorfer 
Dynamic Games of Complete Information 
Perfect Information Case 


1. Player 1 chooses an action a_1 from the feasible set A_1.
2. Player 2 observes a_1 and then chooses an action a_2 from
   the feasible set A_2.


3. Payoffs are u_1(a_1, a_2) and u_2(a_1, a_2).


Backward Induction


Since P2 observes P1's choice before choosing a_2, P2 will
choose a_2 by solving


MAX u_2(a_1, a_2)   over a_2 ∈ A_2


The solution to this problem is called the "best response"
of P2 to P1 or the "reaction function" of P2 to P1's choice.
Denote this function by R_2(a_1), defined by


u_2(a_1, R_2(a_1)) = MAX u_2(a_1, a_2)   over a_2 ∈ A_2


What should P1 do if P2 is known to choose using R_2(a_1)?

MAX u_1(a_1, R_2(a_1))   over a_1 ∈ A_1
Stackelberg Model of Duopoly


Two firms compete in a market. In the Cournot model both
firms choose their output simultaneously (reacting to the
previous output level of their competitor). In the Stackelberg
model, F1 moves first in setting output q_1, then F2 chooses
output q_2 after observing q_1.


Let the payoff to firm i be given by the profit function:
Π_i(q_1, q_2) = q_i[P(Q) - c]


where the inverse demand curve P(Q) is assumed to be linear in
total output Q = q_1 + q_2, i.e.


P(Q) = P(q_1 + q_2) = a - Q


Backwards Induction: F2 solves for q_2 given q_1

MAX Π_2(q_1, q_2) = MAX q_2[a - q_1 - q_2 - c]   over q_2 ≥ 0


which yields the reaction function (i.e., the best response
function)

R_2(q_1) = (a - q_1 - c)/2


What is the difference between this Reaction Function and
that of F2 in a Cournot Game?


Now, given F2's (anticipated) reaction, F1 solves


MAX Π_1(q_1, R_2(q_1)) = MAX q_1[a - q_1 - R_2(q_1) - c]
                       = MAX q_1(a - q_1 - c)/2   over q_1 ≥ 0


q_1* = (a - c)/2,   q_2* = R_2(q_1*) = (a - c)/4








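Compare Cournot-Nash and Stackelberg Outcomes
P(Q) = a - Q   Example: a = 100


Both Firms with the same unit cost c; Example: c = 10


           Cournot-Nash              Stackelberg
q_1        (a - c)/3 = 30            (a - c)/2 = 45
q_2        (a - c)/3 = 30            (a - c)/4 = 22.5
Q          2(a - c)/3 = 60           3(a - c)/4 = 67.5
P          (a + 2c)/3 = 40           (a + 3c)/4 = 32.5
Π_1        (a - c)^2/9 = 900         (a - c)^2/8 = 1012.5
Π_2        (a - c)^2/9 = 900         (a - c)^2/16 = 506.25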





Note: Consumers are better off under Stackelberg. Why? 


Note: Having more information (here F2 knows what F1 has 
done) does not make F2 better off. Why? 
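
Backward induction for the Stackelberg game can be sketched numerically: compute F2's reaction for each candidate q_1, then let F1 pick the q_1 that maximizes its own profit given that reaction. The sketch assumes the example values a = 100 and c = 10 and searches a grid of leader outputs; the function names and the grid size are illustrative choices.

```python
from fractions import Fraction

def reaction_f2(q1, a, c):
    """F2's best response R_2(q_1) = (a - q_1 - c)/2, truncated at zero output."""
    return max(Fraction(0), Fraction(a - q1 - c) / 2)

def stackelberg(a=100, c=10, steps=400):
    """Backward induction over a grid of leader outputs in [0, a - c]."""
    grid = [Fraction(a - c) * k / steps for k in range(steps + 1)]

    def leader_profit(q1):
        q2 = reaction_f2(q1, a, c)          # F1 anticipates F2's reaction
        return q1 * (a - q1 - q2 - c)

    q1 = max(grid, key=leader_profit)
    q2 = reaction_f2(q1, a, c)
    return q1, q2, a - q1 - q2              # leader output, follower output, price

q1, q2, price = stackelberg()
print(float(q1), float(q2), float(price))
```

Comparing the resulting price with the Cournot price (a + 2c)/3 and the follower's profit with the Cournot profit gives one route into the two "Why?" questions above.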


Two-Stage Games of Complete but Imperfect Information 
Theory: Subgame Perfection 


1. Players 1 and 2 simultaneously choose actions a_1 and a_2
   from their respective feasible sets A_1 and A_2.


2. Players 3 and 4 observe the outcome of the first stage (a_1,
   a_2) and then simultaneously choose actions a_3 and a_4 from
   their respective feasible sets A_3 and A_4.


3. Payoffs are u_i(a_1, a_2, a_3, a_4) for all i.


Typically Players 1 and 3 and Players 2 and 4 are the same.
But the idea here is that they can be assumed to be different
for purposes of calculating the outcome. Essentially, we
have two games: G_1 (involving P1 and P2) and G_2 (involving
P3 and P4).


If players 1 and 2 anticipate that the second-stage behavior of
players 3 and 4 will be given by (N_3(a_1,a_2), N_4(a_1,a_2)) then
the first-stage interaction between P1 and P2 amounts to the
following simultaneous-move game:


1. Players 1 and 2 simultaneously choose actions a_1 and a_2
   from feasible sets A_1 and A_2 respectively.


2. Payoffs are u_i(a_1, a_2, N_3(a_1,a_2), N_4(a_1,a_2)), for i = 1,2.


Suppose (a_1*, a_2*) is the unique Nash equilibrium of this
simultaneous-move game. Then we call


(a_1*, a_2*, N_3(a_1*, a_2*), N_4(a_1*, a_2*))


the subgame-perfect outcome of this two-stage game. The idea
is that the outcome to the stage 2 game should be a Nash
equilibrium given the outcome of the stage 1 game. If it is not,
then it will not be a credible outcome (from P1 and P2's
perspective) since P3 and P4 could not be expected to hold to a
solution in stage 2 which is not an equilibrium.


Note: This same idea can be extended to games of any
number of stages.


Example: Two-stage Prisoner Dilemma Game


Assume this is played twice with two players and that each
player can see the result of the first game when playing the
second game.


What will be the result of the second-stage Game? The most
logical choice is that it will be the unique Nash equilibrium
(D, D). Given this, the payoffs in the first stage game are of
the form:








What will the outcome of this stage-one game be and what is 
the meaning of subgame perfectness here? 
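
The backward-induction argument can be traced in code. This sketch assumes a standard Prisoner's Dilemma payoff matrix (3,3 / 0,5 / 5,0 / 1,1); the matrix used on the original slide is not legible in these notes, so the numbers are an assumption, and `pure_nash` is a helper written for illustration.

```python
from itertools import product

ACTIONS = ("C", "D")
# Assumed stage payoffs (row player, column player).
STAGE = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def pure_nash(payoffs):
    """All pure-strategy Nash equilibria of a 2x2 game given as {(a1, a2): (u1, u2)}."""
    eq = []
    for a1, a2 in product(ACTIONS, repeat=2):
        best1 = all(payoffs[a1, a2][0] >= payoffs[b, a2][0] for b in ACTIONS)
        best2 = all(payoffs[a1, a2][1] >= payoffs[a1, b][1] for b in ACTIONS)
        if best1 and best2:
            eq.append((a1, a2))
    return eq

# Stage two: the unique equilibrium is played whatever happened in stage one.
second_stage = pure_nash(STAGE)
bonus = STAGE[second_stage[0]]
# Stage one: add the anticipated stage-two payoffs to every cell.
first_stage = {k: (v[0] + bonus[0], v[1] + bonus[1]) for k, v in STAGE.items()}
print(second_stage, first_stage, pure_nash(first_stage))
```

Adding the same constant to every cell leaves the incentives unchanged, so the first-stage game is again a Prisoner's Dilemma.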


Another Example: Bank Runs 


Two investors have deposited D (each) with a bank. The bank 
has invested these deposits in a long-term project. If the bank is 
forced to liquidate the investment before maturity, a total of 2r 
can be recovered, where D > r > D/2. If the bank allows the 
investment to reach maturity, the project pays out 2R > 2D. 


If both investors make early withdrawals, then each receives r 
and game ends. If only one investor makes withdrawals, then 
that investor receives D and the other receives 2r - D and the 
game ends. Otherwise, the investment matures and a second 
stage game occurs with payoffs as shown. 


STAGE 1 GAME

                 Withdraw          Don't
Withdraw         r, r              D, 2r - D
Don't            2r - D, D         Go to Stage 2


STAGE 2

                 Withdraw          Don't
Withdraw         R, R              2R - D, D
Don't            D, 2R - D         R, R

Backward Induction

Game 2: R > D implies 2R - D > R, so that both
investors "withdraw" is unique Nash equilibrium.
Game 1: "Don't-Don't" now has payoff R,R--Result??
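
The "Result??" question can be explored by enumerating the pure-strategy equilibria of the reduced first-stage game. The numbers D = 100, r = 80 and R = 120 are hypothetical values chosen only to satisfy D > r > D/2 and R > D; the function names are illustrative.

```python
from itertools import product

ACTIONS = ("withdraw", "dont")

def reduced_stage_one(D, r, R):
    """Stage-1 game after replacing 'Go to Stage 2' with the stage-2 equilibrium payoff (R, R)."""
    return {
        ("withdraw", "withdraw"): (r, r),
        ("withdraw", "dont"): (D, 2 * r - D),
        ("dont", "withdraw"): (2 * r - D, D),
        ("dont", "dont"): (R, R),
    }

def pure_nash(game):
    """All pure-strategy Nash equilibria of a two-player game {(a1, a2): (u1, u2)}."""
    eq = []
    for a1, a2 in product(ACTIONS, repeat=2):
        best1 = all(game[a1, a2][0] >= game[b, a2][0] for b in ACTIONS)
        best2 = all(game[a1, a2][1] >= game[a1, b][1] for b in ACTIONS)
        if best1 and best2:
            eq.append((a1, a2))
    return eq

equilibria = pure_nash(reduced_stage_one(D=100, r=80, R=120))
print(equilibria)
```

With these values both "everyone withdraws" and "no one withdraws" are equilibria of the first stage, so a run on the bank is one possible outcome even though it leaves both investors worse off.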














Repeated Games 
Perhaps the most interesting recent results have been those 
"explaining" apparently cooperative behavior in economic 
settings. 


Recall that in PD experiments, we see cooperation increase as 
any of the following increase: 


• Communication (even unrelated communication)
• Acquaintanceship
• Number of Plays of the Game


Two Other Sources of Cooperation in PD-like Games (Marwell-
Schmitt)


• Equity (As x increases, cooperation decreases)

• Interpersonal Risk (As x increases, cooperation
  decreases): IR experiment is a two stage experiment
  which gives P1 the option to push a "take" button if
  both players cooperate.


Basic Theory of Repeated Games 


We imagine a sequence of repeated games, each with the same 
payoff matrix: 


G_1, G_2, G_3, G_4, ...


Question: What is an appropriate definition of equilibrium? 
Answer--Good Old Nash will do, but here it leads to very 
interesting results: 


Definition: Given a game G, let G(T) denote the finitely
repeated game in which G is played T times, with the outcomes
of all earlier plays observed before the next play begins. The
payoffs for G(T) are simply the sum of the payoffs from the T
stage games.


Proposition: If the stage game G has a unique Nash
equilibrium, then for any finite T, the repeated game G(T) has
a unique subgame-perfect outcome; the Nash equilibrium of G
is played in every stage.

Proof: Backward induction. 


Example: Prisoners Dilemma Game Repeated. 


Note: 1. Multiple Nash Equilibria Possible


Infinitely Repeated Games 


Let T approach infinity in G(T) but make payoffs be given by the
present value of stage payoffs (with a common discount factor δ
< 1), i.e. for player i


Π_i(1) + δ Π_i(2) + δ^2 Π_i(3) + ... = SUM over t = 1, 2, ... of δ^(t-1) Π_i(t)


Interpretation of δ: Either discounted monetary reward, or
probability of continuing play to next stage.

Trigger Strategy for PD: Play C_i in the first stage. In the t-th
stage, if the outcome of all t-1 preceding stages has been (C, C)
then play C_i, otherwise play D_i.


Result: Tit-for-Tat is a Nash equilibrium for G(∞,δ) under the
following conditions (see payoff matrix below):


T > R > P > S,   R > (S + T)/2
and 





Why? 


Interpretation in Biology Setting: Tit for Tat is evolutionarily
stable (i.e., efficient) if and only if the interactions between
individuals have a sufficiently large probability of continuing.


Strategies and Subgame Perfection 


Definition: In the finitely repeated game G(T) or the infinitely
repeated game G(∞,δ), a player's strategy specifies the action the
player will take in each stage for each possible history of play
through the previous stage.


Definition: In the finitely repeated game G(T), a subgame
beginning at stage t+1 is the repeated game in which G is played
T-t times. There are many subgames that begin at t+1, one for
each of the possible histories of play through stage t.


Similar definition for infinitely repeated game G(∞,δ) except here
each subgame at stage t is identical to the original game.


Definition: (Selten 1965): A Nash equilibrium is subgame
perfect if the players' strategies constitute a Nash equilibrium
in every subgame (for whatever history might have
materialized up until the subgame begins).


Definition: Average payoff = (1 - δ) × Payoff.


Theorem (Friedman 1971): Let G be a finite static game of
complete information. Let e = (e_1, ..., e_n) denote the payoffs
from a Nash equilibrium of G and let x = (x_1, ..., x_n) denote
any other feasible payoffs from G. If x_i > e_i for every player
i and if δ is sufficiently close to one, then there exists a
subgame perfect Nash equilibrium of the infinitely repeated
game G(∞,δ) that achieves x as the average payoff. The
requirement on δ is that, for every i, δ > (d_i - x_i)/(d_i - e_i),
where d_i is the payoff from deviating from x to the best
alternative strategy for i when all others play x.
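
The threshold in the theorem can be illustrated for a Prisoner's Dilemma. The sketch below assumes payoffs T = 5 (temptation), R = 3 (mutual cooperation) and P = 1 (mutual defection), so that x_i = R, e_i = P and d_i = T; it compares cooperating forever with defecting once and being punished by (D, D) forever. The function names are illustrative.

```python
from fractions import Fraction

def friedman_threshold(d, x, e):
    """Critical discount factor (d - x)/(d - e) from the theorem's requirement."""
    return Fraction(d - x, d - e)

def discounted(stream_head, tail, delta):
    """Present value of a payoff stream: a finite head, then `tail` in every later stage."""
    value = sum(pay * delta**t for t, pay in enumerate(stream_head))
    return value + tail * delta**len(stream_head) / (1 - delta)

T, R, P = 5, 3, 1   # assumed PD payoffs
threshold = friedman_threshold(d=T, x=R, e=P)
print(threshold)
for delta in (Fraction(1, 4), Fraction(3, 4)):
    cooperate = discounted([], R, delta)    # (C, C) in every stage
    deviate = discounted([T], P, delta)     # defect once, then (D, D) forever
    print(delta, cooperate, deviate, cooperate >= deviate)
```

Below the threshold the one-time gain from defecting outweighs the discounted loss from punishment; above it, cooperation is sustainable.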


PD Game? (δ = probability of continuing play to the next stage)
OPIM 102: Decision Processes
Dr. Kleindorfer 
April 13, 1998 


Plan for the Remainder of the Semester 
April 13: Continue Discussion of Dynamic Games 


April 15: An R&D Race: Case Study (See Syllabus 
for Class Discussion Questions) 


April 20: The Race to Develop Human Insulin (See 
Syllabus for Case Study Questions) 2-Page 
Write-up Due on this Date: Don’t Be Late! 


April 22: Discussion of GA Algorithm for PD Game; 
Hand in Write-up for Group Assignment 
(see attached assignment) 


On April 22, I will also pass out some review questions for
the Final Exam.


My office hours for Reading Week will be as follows:


• Monday April 27: 2:00 - 3:00 PM
• Tuesday April 28: 2:30 - 4:00 PM Review Session (Room TBA)
• Monday May 4: 1:00 - 3:00 PM
• Wednesday May 6: 9:00 - 10:30 AM


OPIM 102
Instructions for Retrieving and Unpacking Course
Materials for the GA Assignment to Your UNIX Account


In order to retrieve the course materials, you will have to ftp to the anonymous ftp site at
OPIM.WHARTON.UPENN.EDU. You do this via your futures or equity accounts. At
your prompt, you will type:


futures(~)% ftp opim.wharton.upenn.edu
Login Name: anonymous
Password: youremail@hostname.edu (if required)


The file you are looking for is in the /opim/opim102 directory; it is called dilemma. To
retrieve this file, at the prompt type:


ftp> cd opim/opim102
ftp> bin (This is to signify binary transfer mode)
ftp> get dilemma

At this point you will get a status bar or no prompt. The file will be transferring to your
account during this time. When you receive a prompt once again, the transfer statistics
will appear. Now that you have the file in your account, you will need to exit the ftp
program. You do this by typing:

ftp> bye

This will return you to your Unix (i.e. futures, equity, etc.) prompt.
From now on, when you want to run the program, you do it by typing:

futures(~)% dilemma (and answer the questions the program asks you)

If you would like to print the results to a file you will type:

futures(~)% script (You will see the following appear on the screen)
Script started, file is typescript
futures(~)% dilemma (You have to type that)


Now when the program has finished, you will want to change the name of the output file
from typescript to whatever you like. To do this type:


futures(~)% mv typescript trial1 (trial1 is the name of the new file; you can choose any
name you like)


Sample output from "dilemma"


Script started on Sat Apr 11 12:09:19 1998
futures(~)% dilemma
Payoff matrix for the prisoner's dilemma:


       C     D
C     3.0   0.0
D     5.0   1.0
Population size [30] ?
Population size [30] ?


Length of the history [3] ?
Probability of trembling [0.000] ?
Number of rounds [100] ?
Number of generations [25] ?


------------------------------------------------------------


Parameters for the genetic algorithm: 


Size of the population: 30 
Overlap between cycles: 29 
Chromosome size: 16 

Genome size: 16 

Allele size: 1 
Selective pressure: 2.000 
Sharing of resources: 0.000 
Mutation probability: 0.005 
Seed for random numbers: 892310985 


Evolution of cooperation (avg = average payoff, coop = frequency of
cooperation, rcp = reciprocity of the opponent's previous move):

                avg    coop   rcp(C) rcp(D) best strategy
Generation  0: 2.246  0.516  0.296  0.263  (id = 24, fitness = 2.704333) (CDDDDDCDDDCDCDDC)
Generation  1: 2.198  0.504  0.312  0.303  (id = 49, fitness = 2.506000) (CDDDDDCDDDCDCDDC)
Generation  2: 2.708  0.807  0.717  0.105  (id = 77, fitness = 2.993333) (CDDDDDCDDDCDCCDC)
Generation  3: 2.821  0.886  0.839  0.067  (id = 104, fitness = 2.951667) (CCDCCDCCCDDDCDDD)
Generation  4: 2.855  0.906  0.862  0.050  (id = 121, fitness = 2.991333) (CDDCCDCCCDDDCCDC)
Generation  5: 2.907  0.946  0.923  0.032  (id = 97, fitness = 2.958667) (CDDDDDCDDDCDCDDC)
Generation  6: 2.914  0.950  0.928  0.028  (id = 169, fitness = 2.943333) (CDDDDDCDDDCDCDDC)
Generation  7: 2.920  0.954  0.935  0.026  (id = 212, fitness = 2.959000) (CDDDCDCDDDDDCDDC)
Generation  8: 2.913  0.950  0.929  0.029  (id = 228, fitness = 2.960667) (CDDDCDCDDCCDCDDC)
Generation  9: 2.913  0.951  0.929  0.029  (id = 284, fitness = 2.936667) (CDDDCDDDDDCDCDDC)
Generation 10: 2.912  0.950  0.928  0.029  (id = 274, fitness = 2.954333) (CDDDDDCDDDCDCDDC)
Generation 11: 2.915  0.952  0.931  0.028  (id = 360, fitness = 2.969667) (CDDDCDCDDCCDCDDC)
Generation 12: 2.909  0.948  0.925  0.030  (id = [illegible], fitness = 2.941333) (CDDDCDCDDCCDCDDC)
Generation 13: 2.911  0.950  0.929  0.030  (id = 335, fitness = 2.938667) (CDDDCDCDDDDDCDDC)
Generation 14: 2.909  0.948  0.926  0.030  (id = 438, fitness = 2.938333) (CDDDCDCDDDCDCDDC)
Generation 15: 2.921  0.955  0.936  0.027  (id = 471, fitness = 2.955333) (CDDDDDCDDDDDCDCC)
Generation 16: 2.910  0.950  0.929  0.030  (id = 510, fitness = 2.936000) (CDDDCDCDDCDDCDCC)
Generation 17: 2.910  0.949  0.927  0.031  (id = 415, fitness = 2.952333) (CDDDCDCDDDDDCDCC)
Generation 18: 2.917  0.953  0.934  0.028  (id = 488, fitness = 2.937333) (CDDDCDCDDDDDCDDC)
Generation 19: 2.912  0.950  0.930  0.030  (id = 540, fitness = 2.941667) (CDDDCCDDDCDDCDCC)
Generation 20: 2.919  0.954  0.935  0.028  (id = 519, fitness = 2.949000) (CDDDDDCDDDDDCCDC)
Generation 21: 2.911  0.950  0.929  0.030  (id = 660, fitness = 2.932667) (CDDDDDCDDDDDCCDC)
Generation 22: 2.915  0.952  0.933  0.029  (id = 667, fitness = 2.936000) (CDDDCDCDDCDDCCDC)
Generation 23: 2.918  0.954  0.935  0.028  (id = 650, fitness = 2.945000) (CDDDCDCDDDDDCDDC)
Generation 24: 2.914  0.952  0.933  0.029  (id = 142, fitness = 2.935333) (CDDDCDCDDDDDCCDC)


Population:


(id = …, fitness = 2.935333) (CDDDCDCDDDDDCCDC)
(id = …, fitness = 2.934333) (CDDDCDCDDCDDCCDC)
(id = …, fitness = 2.934000) (CDDDCDCDDDDDCCDC)
(id = …, fitness = 2.927333) (CDDDCDCDDDDDCDDC)
(id = …, fitness = 2.927333) (CDDDCDCDDDDDCDDC)
(id = …, fitness = 2.925333) (CDDDCDCDDDDDCDCC)
(id = …, fitness = 2.923667) (CDDDCDCDDDDDCDDC)
(id = …, fitness = 2.922333) (CDDDDDCDDDDDCDCC)
(id = …, fitness = 2.920667) (CDDDCDCDDDDDCDCC)
(id = …, fitness = 2.919333) (CDDDCDCDDDDDCCDC)
(id = …, fitness = 2.918667) (CDDDCDCDDDDDCDDC)
(id = …, fitness = 2.918000) (CDDDCDCDDDDDCDDC)
(id = …, fitness = 2.917000) (CDDDDDDDDDDDCDCC)
(id = …, fitness = 2.915667) (CDDDCDCDDDDDCDDC)
(id = …, fitness = 2.915333) (CDDDDDCDDDDDCDDC)
(id = …, fitness = 2.915000) (CDDDCCDDDCDDCDCC)
(id = …, fitness = 2.913333) (CDDDDDCDDDDDCDCC)
(id = …, fitness = 2.913333) (CDDDCDCDDDDDCDDC)
(id = …, fitness = 2.912667) (CDDDDDCDDDDDCDCC)
(id = …, fitness = 2.912333) (CDDDDDCDDDDDCDCC)
(id = …, fitness = 2.909000) (CDDDCDCDDDDDCDCC)
(id = …, fitness = 2.908667) (CDDDCDCDDDDDCDDC)
(id = …, fitness = 2.908000) (CDDDCDCDDCDDCDCC)
(id = …, fitness = 2.905000) (CDDDCDCDDDDDCDDC)
(id = …, fitness = 2.904000) (CDDDCDCDDDDDCDDC)
(id = …, fitness = 2.902667) (CDDDCDCDDDDDCDDC)
(id = …, fitness = 2.900333) (CDDDDDCDDDDDCDCC)
(id = …, fitness = 2.898667) (CDDDCDCDDDDDCDCC)
(id = …, fitness = 2.894000) (CDDDDDDDDDDDCDCC)
(id = …, fitness = 2.883000) (CDDDCDCDDCDDCCDC)


OPIM 102: Decision Processes 
Dr. Kleindorfer 
April 15, 1998 


Today 
• Questions about GA
• Dynamic Games--Continued


• Introduction to R&D Race


Next Time: 11/20/97 


Answer Question from Syllabus in Two-page Write-up on 
R&D Case 


What light does the “model” shed on Eli Lilly’s launching 
and subsequent management of the race to develop human 
insulin? 


The Cournot Case: Repeated Game Setting 
We have n identical firms and a given price function
P(Q) = a - Q, where Q = q_1 + q_2 + ... + q_n


We know that the Nash equilibrium for the static game is
given by


q_i = (a - c)/(n+1), for all i = 1, 2, ..., n;
so that Q = n(a - c)/(n+1) and P = (a + nc)/(n+1)
Note: Π_i = ((a - c)/(n+1))^2 = e_i (Nash equilibrium payoff)


What is the "collusive" outcome? Think of this as sitting
around the table and figuring out the max profit solution.


Clearly the answer is the monopoly output and price.
q_i = (a - c)/2n, for all i = 1, 2, ..., n;


so that Q = (a - c)/2 and P = (a + c)/2
Note: Π_i = ((a - c)/2)^2/n = x_i > e_i.


Can we make this be the Nash outcome of a repeated 
game? 


Repeated Cournot Solution (Part 2) 


e_i = Nash payoff, x_i > e_i = desired payoff.


What can firm i earn by deviating if everyone else keeps to the collusive output?


Q_{-i} = (n-1)(a - c)/2n (total output of the other n-1 firms)
q_i = B_i(q_{-i}) = (a - c - Q_{-i})/2

= ((a - c)/2) - ((n-1)/2n)((a - c)/2)

= ((a - c)/2)((n+1)/2n)
Plug this in to Profit Function to obtain
d_i = Π_i(B_i(q_{-i}), q_{-i})

= ((a - c)/2)^2 x ((n+1)/2n)^2 > ((a - c)/2)^2 x (1/n) = x_i.

Now use the Theorem:
δ(n) ≥ (d_i - x_i)/(d_i - e_i) = N(n)/D(n)
where
N = [((n+1)/2n)^2 - (1/n)]((a - c)/2)^2
D = [((n+1)/2n)^2 - 4/(n+1)^2]((a - c)/2)^2


Easy to see that δ(n) → 1 as n → ∞. What does this mean?
Collusion becomes more difficult as n gets big.
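

The limit claim can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name is hypothetical, and the three payoffs are written in units of ((a - c)/2)^2 so that a and c drop out of the ratio N(n)/D(n).

```python
from fractions import Fraction


def critical_discount_factor(n: int) -> Fraction:
    """Smallest discount factor that sustains collusion among n Cournot firms.

    All payoffs are expressed in units of ((a - c)/2)^2 (an assumption made
    here for convenience; the common factor cancels in the ratio).
    """
    if n < 2:
        raise ValueError("need at least two firms")
    d = Fraction(n + 1, 2 * n) ** 2   # payoff from the best one-shot deviation
    x = Fraction(1, n)                # collusive (monopoly share) payoff
    e = Fraction(4, (n + 1) ** 2)     # static Nash equilibrium payoff
    return (d - x) / (d - e)


if __name__ == "__main__":
    for n in (2, 3, 5, 10, 100):
        delta = critical_discount_factor(n)
        print(f"n = {n:3d}: delta(n) = {delta} (about {float(delta):.4f})")
```

For a duopoly (n = 2) this gives 9/17, and the value rises toward 1 as n grows, which is the sense in which collusion becomes harder with more firms.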
R&D Intensities

Table 2 (columns: stage Firm A is working on; rows: stage Firm B is working on)


Expected Values ($ million)

Table 3 (columns: stage Firm A is working on; rows: stage Firm B is working on)


(NOTE: Entry in the upper right hand corner of each cell is A's effort/value and
entry in the lower left hand corner is B's effort/value.)


Evolution of the Race

Table 4 (columns: stage Firm A is working on, I, II, III; rows: stage Firm B is working on)


Chances of Winning the Race

Table 5 (columns: stage Firm A is working on; rows: stage Firm B is working on)
Hints for Studying for the Final Examination
OPIM 102 
Dr. Paul Kleindorfer 
April 22, 1998 


The exam will be closed book. It will take place from 8:30 to 10:30 am on Friday May 8, 
in Room 103 SH-DH. Please do the following to review for the examination: 


Review all class notes from the beginning of the course. These will be a good indication of 
the kinds of questions I am likely to ask. 


Be prepared to answer questions of the sort I list further on in this handout. 
As always, it usually helps to study in groups to prepare for finals. But make sure you 
understand the answers to each of the questions. It is your learning process which I will 
test in the final examination. 
I will be holding office hours on Monday April 27 from 2 to 3 pm, on Tuesday April 28
from 2:30 through 4 pm (a review session in Room 202 SH-DH), on Monday May 4 from 1
to 3 pm and on Wednesday, May 6 from 9 through 10:30 pm. Please come.
Good luck in your preparation. 
Typical Questions 
1. What are the axioms of EUT? Are these always satisfied by every DM? In what sense 
do they embody the concept of rational choice? See KKS, Chapter 4. 
2. Certainty Equivalent and Risk Premium: Suppose a DM has the following utility 
function 
U(x) = Min[x, .5x+40]

Show that U(x) is concave with a kink at x = 80 (plot it!). Thus, we know that the DM
with this utility function is risk averse. Illustrate this by computing for W = 0 and the
lottery A:

A = [.6, 50; .4, 100]


E{A} = .6(50) + .4(100) = 70


CE(A) (= 66) and RP(A) (= 4 > 0)


Note: U(CE(A)) = .6U(50) + .4U(100) = .6(50) + .4(90) = 66 = U(66) 
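

The arithmetic in the note can be reproduced with a few lines of code. This is only an illustrative sketch: the function names are hypothetical, and the inverse utility is written out by hand from the two linear pieces of U(x) = Min[x, .5x+40].

```python
def utility(x: float) -> float:
    """U(x) = Min[x, .5x + 40]: slope 1 below the kink at x = 80, slope .5 above."""
    return min(x, 0.5 * x + 40.0)


def inverse_utility(u: float) -> float:
    """Invert U piece by piece (U(80) = 80 is the kink)."""
    return u if u <= 80.0 else 2.0 * (u - 40.0)


def certainty_equivalent(lottery, wealth: float = 0.0) -> float:
    """Solve U(W + CE) = E{U(W + outcome)} for CE."""
    expected_utility = sum(p * utility(wealth + x) for p, x in lottery)
    return inverse_utility(expected_utility) - wealth


# Lottery A = [.6, 50; .4, 100] as (probability, outcome) pairs.
A = [(0.6, 50.0), (0.4, 100.0)]

expected_value = sum(p * x for p, x in A)
ce = certainty_equivalent(A)
risk_premium = expected_value - ce

if __name__ == "__main__":
    print(f"E{{A}} = {expected_value:.2f}, CE(A) = {ce:.2f}, RP(A) = {risk_premium:.2f}")
```

The risk premium is positive because the 100 outcome falls on the flatter part of U, so it is worth only 90 in utility terms.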


3. Maximum Willingness to Pay (WTP) for a DM: The maximum a DM is WTP to play a
lottery is found by considering the impact that this payment will have on the ultimate
outcome of the lottery. Thus, if the DM pays 10 to play the lottery


A = [.6, 50; .4, 100],


then this is in effect the same thing as playing the lottery A-10, obtained from A by 
reducing every outcome by 10: 


A-10 = [.6, 40; .4, 90] 


To determine the maximum which the DM should be WTP to play a lottery like A, we 
simply find the largest payment the DM could make and still not be worse off than doing 
nothing. If W is the DM’s initial wealth level, this amounts to solving the following 
equation: 


U(W) = E{U(W + A - WTP)}
Consider the lottery A above and assume the DM has preferences which can be represented
by an exponential utility function U(x) = -exp[-cx], where c = the DM’s degree of risk
aversion. Write an expression characterizing MAX WTP(A). Write another expression
characterizing CE(A) and show (if you can) that CE(A) = MAX WTP(A). Can you explain
why this is so? Do the same thing for U(x) = x^.5 and show when W=100 that CE(A) >
WTP(A).
-exp[-cW] = -.6exp[-c(W+50-WTP)] - .4exp[-c(W+100-WTP)]
-exp[-c(W+CE(A))] = -.6exp[-c(W+50)] - .4exp[-c(W+100)]


U(x) = SQRT(x), W = 100 → CE = 69.14, WTP = 68.57
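

The two numbers quoted for the square-root case can be checked by solving the defining equations numerically. The sketch below is an illustration under stated assumptions: the helper names and the bisection routine are hypothetical, and the risk-aversion value c = 0.01 used for the exponential case is an arbitrary choice, not a value from the handout.

```python
import math


def bisect(f, lo: float, hi: float, tol: float = 1e-10) -> float:
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def expected_utility(u, lottery, wealth: float, shift: float = 0.0) -> float:
    return sum(p * u(wealth + x + shift) for p, x in lottery)


def certainty_equivalent(u, lottery, wealth: float) -> float:
    """Solve U(W + CE) = E{U(W + A)} for CE."""
    target = expected_utility(u, lottery, wealth)
    lo = min(x for _, x in lottery)
    hi = max(x for _, x in lottery)
    return bisect(lambda ce: u(wealth + ce) - target, lo, hi)


def max_wtp(u, lottery, wealth: float) -> float:
    """Solve U(W) = E{U(W + A - WTP)} for WTP."""
    lo = min(x for _, x in lottery)
    hi = max(x for _, x in lottery)
    return bisect(lambda wtp: expected_utility(u, lottery, wealth, -wtp) - u(wealth), lo, hi)


A = [(0.6, 50.0), (0.4, 100.0)]
W = 100.0

ce_sqrt = certainty_equivalent(math.sqrt, A, W)
wtp_sqrt = max_wtp(math.sqrt, A, W)

c = 0.01  # hypothetical degree of risk aversion


def exp_utility(x: float) -> float:
    return -math.exp(-c * x)


ce_exp = certainty_equivalent(exp_utility, A, W)
wtp_exp = max_wtp(exp_utility, A, W)

if __name__ == "__main__":
    print(f"SQRT:        CE = {ce_sqrt:.2f}, WTP = {wtp_sqrt:.2f}")
    print(f"exponential: CE = {ce_exp:.2f}, WTP = {wtp_exp:.2f}")
```

With exponential utility the wealth term factors out of both equations, which is why CE and WTP coincide; with SQRT the payment lowers wealth and makes the DM more risk averse, so WTP falls below CE.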


4. Minimum Willingness to Accept (WTA) to play a lottery. In a similar fashion, if the 
lottery is not a desirable one, and you are given a free choice whether or not to play it, then 
you may have to be paid something to play it. Consider the lottery


B = [.5, -50; .5, 0]


Then if you are paid something to play B, say you are paid 10, this has the effect of adding
10 to each outcome, resulting in the lottery B+10: 


B+10 = [.5, -40; .5, 10] 

The minimum (WTA) you should accept to be paid to play B would be the amount which 
would just make you indifferent between the lottery + payment and doing nothing. If W is 
your initial wealth, this results in the equation: 

U(W) = E{U(W + B + WTA)}
Suppose you are risk neutral (i.e., you have the utility function U(x) = x), what is the
minimum WTA you would accept to play the lottery B given above? In this case, show
that this is the same as -CE(B). Would this change if U(x) = -exp[-cx]? If W=100 and
U(x) = x^.5? Does WTA(B) = -CE(B) always?
ANSWER: This holds for exponential, but not for SQRT. For example:


-exp[-c100] = -.5exp[-c(100-50+WTA)] - .5exp[-c(100-0+WTA)]


or
2 = exp[-c(WTA - 50)] + exp[-cWTA]

Similarly we get the same expression for CE(B):

-exp[-c(100 + CE(B))] = -.5exp[-c(100 - 50)] - .5exp[-c100]

or


2 = exp[-c(-CE(B) - 50)] + exp[-c(-CE(B))]
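

The contrast between the exponential and SQRT cases can be seen numerically. The following is a sketch under stated assumptions: the function names are hypothetical, c = 0.02 is an arbitrary illustrative value, and the closed forms for the exponential case follow from cancelling exp[-cW] on both sides of the two equations above.

```python
import math

# Lottery B = [.5, -50; .5, 0] as (probability, outcome) pairs.
B = [(0.5, -50.0), (0.5, 0.0)]


def ce_exponential(lottery, c: float) -> float:
    """CE for U(x) = -exp[-cx]: CE = -(1/c) ln E{exp[-cB]} (wealth cancels)."""
    return -math.log(sum(p * math.exp(-c * x) for p, x in lottery)) / c


def wta_exponential(lottery, c: float) -> float:
    """WTA for U(x) = -exp[-cx]: WTA = (1/c) ln E{exp[-cB]} (wealth cancels)."""
    return math.log(sum(p * math.exp(-c * x) for p, x in lottery)) / c


def ce_sqrt(lottery, wealth: float) -> float:
    """Solve SQRT(W + CE) = E{SQRT(W + B)} for CE."""
    expected_utility = sum(p * math.sqrt(wealth + x) for p, x in lottery)
    return expected_utility ** 2 - wealth


def wta_sqrt(lottery, wealth: float, tol: float = 1e-10) -> float:
    """Solve SQRT(W) = E{SQRT(W + B + WTA)} by bisection on [0, largest loss]."""
    lo, hi = 0.0, -min(x for _, x in lottery)
    target = math.sqrt(wealth)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(p * math.sqrt(wealth + x + mid) for p, x in lottery) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    c = 0.02  # hypothetical degree of risk aversion
    print(f"exponential: WTA = {wta_exponential(B, c):.4f}, -CE = {-ce_exponential(B, c):.4f}")
    print(f"SQRT, W=100: WTA = {wta_sqrt(B, 100.0):.4f}, -CE = {-ce_sqrt(B, 100.0):.4f}")
```

For SQRT with W = 100 the sketch gives WTA of about 26.56 against -CE of about 27.14: being paid raises wealth and lowers risk aversion, so a smaller payment suffices.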


5. Portfolio Choice: Suppose a DM with initial wealth W and an investment budget B can
invest in n different securities, i = 1, ..., n. Suppose the return from each security is x_i R_i,
where R_i is a random variable with a known distribution. Then the portfolio choice
problem can be represented as follows:


Maximize E{U(W + Σ_{i=1}^{n} x_i R_i)}

Subject to: Σ_{i=1}^{n} x_i ≤ B, x_i ≥ 0, for all i.


What do you think happens to the investment/portfolio choices of the DM if the DM is
very risk averse? I.e., what happens in the case U(x) = -exp[-cx] if c is large?


ANSWER: DM starts moving to a less risky portfolio. If we assume normal returns,
then we can actually plot EU contours and the efficient frontier for different
portfolios as a function of the mean and standard deviation (or sigma) of the
portfolio.


What Should You Know from Rules of the Game 
and from Chapter 1 of Gibbons? 


Intuition of various simple games in Rules of the Game and when equilibrium and Pareto 
solutions exist to these games. 


What is the definition of Nash equilibrium, Pareto solution, and Dominated solution? 


What is the procedure for sequential elimination of strictly dominated strategies? 
Understand that this will never get rid of Nash equilibrium strategies. 


Best Response Functions. 
Cournot Game and how the Nash equilibrium to this game is derived. 


What a mixed strategy is and what best response functions are for 2-person games with
mixed strategies.


The problems I assigned (but not the case of three candidates for problem 1.8). 


The Commons Dilemma Game (see also experimental results in Chapter 7 of KKS) 


What you should know about Dynamic Games 


Perfect Information and Backward Induction as in the case of the Two-Stage Prisoner 
Dilemma Game and Bank Run Game 


Stackelberg Model of Duopoly (see class notes) 


What a Trigger Strategy is and when it is a Nash equilibrium to the repeated game. 
Other Questions of Interest 


1. The Theory of Argument
What is Toulmin's Theory of Argument? Briefly describe this and give an example 
of its use. 


2. Prediction and Inference 


a. Define and give brief examples of two key descriptive features of judgement
under uncertainty which reduce effectiveness and accuracy of individuals in prediction and
inference tasks.


b. Indicate for the two descriptive features you have defined in (a) prescriptive
approaches to improving decisionmaker performance.


3. Valuation and Choice: General 


a. State the Allais and Ellsberg Paradoxes and indicate why they are not
consistent with Expected Utility Theory.


b. Give an example of preference reversal. Show explicitly whether or not
Prospect Theory provides a descriptive theory which is consistent with your example.


4. Valuation and Choice: Specific 


The typical consumer of automobile insurance may not be well informed about the 
risks that she faces. Suppose such risks are expressed as a chance event of having an accident 
with economic costs L (if it occurs) with probability of occurrence p per year. 


a. How much should the consumer be willing to pay in annual premiums if she
is a von Neumann-Morgenstern EU maximizer and is completely informed of p and L?


b. Suppose that the typical consumer knows p but that she is uncertain as to the
precise value of L, but thinks it to be in the range of L_1 to L_2 with L_1 < L_2 and the actual
value L satisfying L_1 < L < L_2. Describe a descriptive model of choice which the consumer
might use to determine the maximum premium she would be willing to pay. Explain how
you would validate your descriptive model.


c. Indicate how an expected profit-maximizing (i.e., risk neutral) insurance
company might actually set premia to insure consumers who behave like the one whose
decision process you model in (b).


d. Would it be in your interest as an expected profit-maximizing insurance
company to educate consumers of the type you describe in (b) as to the actual value of L?


5. Decision Trees


You should know how to draw simple decision trees, how to determine the optimal 
decisions by backward induction and how to compute the value of information when pilot 
tests or the like can be run to provide better estimates of costs, profits or probabilities. 


6. Legitimation Theory


In Section 5.4 of KKS, there is a description of several different orientations to
prescriptive solutions. How might these different orientations be useful in defining a
prescriptive approach in a group setting, e.g. a quality team attempting to improve a
particular organizational process?


7. Valuation and Choice 


Earthquake insurance companies have begun to give rebates (in the form of premium
discounts) to property owners if they undertake certain mitigation measures on their homes
(e.g., bolting their home to its foundation). This is done to promote the use of such
mitigation measures and also because the insurance companies (and consumers) save money
in the event of an earthquake because the mitigation measure reduces losses. This question
concerns how much of a discount is required in order to induce property owners to undertake
a certain mitigation measure.


a. Suppose a homeowner knows (or thinks) that the probability of an earthquake is p
= .01 per year and that the loss to the homeowner if one occurs is L = $100,000. Write an
expression (you do not have to solve it!) which characterizes how much the homeowner
should be willing to pay in annual insurance premiums for complete coverage if she is a
(risk-averse) von Neumann-Morgenstern EU maximizer and is completely informed of p and
L. You may assume that the homeowner has initial wealth W = $100,000. Hint: what is the
homeowner's certainty equivalent of the lottery [p, L; (1-p), 0]? You can use the
exponential utility function to express your answer.


b. Now suppose the homeowner believes that losses will be affected by whether or
not mitigation is undertaken before the fact. In fact, the homeowner thinks that in the event
of an earthquake losses will be $50,000 if mitigation is undertaken before the earthquake.
Assume the same probability of an earthquake as in part (a) (p = .01). If mitigation costs
$400 per year (think of this as paying off a loan to the bank for the total cost of the
mitigation), what is the maximum insurance premium which the homeowner should be
willing to pay if she undertakes mitigation (and pays for it herself)?


c. What is the minimum reduction in insurance premium that the homeowner
would require (again for full coverage) in order to convince her to install the mitigation
measure on her home?

d. Note that the maximum such an insurer will offer (in annual premium reduction) a
homeowner who installs a mitigation measure of the sort described will be the actuarial value
of the reduction in loss, i.e. .01x(reduction in loss) = .01x$50,000 = $500. Assume that this
is greater than the minimum amount the homeowner would require to adopt mitigation. Why
might an expected profit-maximizing (i.e., risk neutral) insurance company offer greater
premium reductions than the minimum required by homeowners?


8. Group Choice 


Discuss what the "conformity bias" (the Asch experiments) is and why it is an
important element of Group Problem Solving. How would such a method as the Delphi 
Technique (used for Group Judgment and Forecasting) partially address the Conformity 
Bias? 


9. Cournot Games


Consider a "linear duopoly" with price (or inverse demand function) P(Q) = 20 - Q = 20 -
q_1 - q_2 and with cost for both firms of c = 1.


a. Derive the reaction functions (also called best response functions) for both firm
1 and firm 2.


b. Describe the Nash equilibrium in terms of these reaction functions. 
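

One way to check an answer to this question is to iterate the two reaction functions until they stop changing; the fixed point is the Nash equilibrium. The sketch below is illustrative only: the names are hypothetical, and the best response q_i = (a - c - q_j)/2 is the linear Cournot form from the class notes with a = 20 and c = 1.

```python
A_INTERCEPT = 20.0  # intercept of the inverse demand P(Q) = 20 - Q
UNIT_COST = 1.0     # cost c for both firms


def best_response(q_other: float) -> float:
    """Profit-maximizing output against q_other in the linear Cournot model."""
    return max(0.0, (A_INTERCEPT - UNIT_COST - q_other) / 2.0)


def nash_by_iteration(q1: float = 0.0, q2: float = 0.0,
                      tol: float = 1e-12, max_iter: int = 1000):
    """Alternate best responses until neither firm wants to change output."""
    for _ in range(max_iter):
        new_q1 = best_response(q2)
        new_q2 = best_response(new_q1)
        if abs(new_q1 - q1) < tol and abs(new_q2 - q2) < tol:
            return new_q1, new_q2
        q1, q2 = new_q1, new_q2
    return q1, q2


if __name__ == "__main__":
    q1, q2 = nash_by_iteration()
    print(f"q1 = {q1:.4f}, q2 = {q2:.4f}, price = {A_INTERCEPT - q1 - q2:.4f}")
```

The iteration settles at q_1 = q_2 = 19/3, which agrees with the general formula q_i = (a - c)/(n+1) for n = 2.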


10. Commons Dilemma Game (the n-person Generalization of the PD Game)


Consider the following 5-person generalization of the Commons Dilemma Game based on
KKS, Chapter 7. There are 5 players and each player has two strategies, either cooperate
(C) or defect (D). (An example of this in real life is that you can either put an air
emissions control device on your car or you can "disconnect" it, leading to slightly better 
gas mileage, but also to pollution. You can thus either cooperate and leave the device on 
or you can defect and disconnect it.) 


Let x be any collective behavior for this game (i.e., a 5-tuple of C's and D's representing 
the choices of the 5 players). The payoffs to each of the 5 players at x are of the form: 


U_i(x) = 4N(x) if x_i = C;
and


U_i(x) = 28 - 4[n - N(x)] if x_i = D;


where N(x) is the total number of players cooperating at x (i.e., the total number of C's in
the vector x), so that n - N(x) is the number of defectors at x. The payoffs are given
below.


Payoffs (in $'s) for Commons Dilemma Problem when n=5


Number Cooperating | Payoffs to Defectors | Payoffs to Cooperators
        0          |          8           |    Not Applicable
        1          |         12           |          4
        2          |         16           |          8
        3          |         20           |         12
        4          |         24           |         16
        5          |    Not Applicable    |         20








a. Show that the unique Nash equilibrium to this game involves everyone defecting.
Show that this equilibrium is "stable" in the sense that no matter how many people
are now cooperating, it will be in each player i's interest to defect, i.e., defection is
a "dominant strategy" for each player.


b. How does aggregate output (the sum of the dollar payoffs to each player) compare
under the Nash equilibrium relative to socially optimal output (the cooperative
solution)? In what sense does this game represent a "social dilemma"?

c. Based on the results of experiments with "real people", when and when not does the
(inferior) Nash equilibrium tend to obtain in experiments and when does the socially
(or Pareto) optimal "cooperative" equilibrium tend to obtain?
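

The payoff rule for this game is simple enough to check by enumeration. The sketch below is a hypothetical illustration (the function names are not from the handout): it encodes the two payoff formulas for n = 5, tests whether defecting beats cooperating for every possible number of other cooperators, and compares total payoffs at the two extremes.

```python
N_PLAYERS = 5


def payoff(cooperates: bool, n_cooperators: int, n: int = N_PLAYERS) -> int:
    """Payoff to one player when n_cooperators players (in total) choose C."""
    if cooperates:
        return 4 * n_cooperators
    return 28 - 4 * (n - n_cooperators)


def defection_is_dominant(n: int = N_PLAYERS) -> bool:
    """Compare D with C for every number k of OTHER players who cooperate."""
    return all(payoff(False, k, n) > payoff(True, k + 1, n) for k in range(n))


def total_payoff(n_cooperators: int, n: int = N_PLAYERS) -> int:
    """Sum of payoffs over all n players."""
    cooperators = n_cooperators * payoff(True, n_cooperators, n)
    defectors = (n - n_cooperators) * payoff(False, n_cooperators, n)
    return cooperators + defectors


if __name__ == "__main__":
    print("defection dominant:", defection_is_dominant())
    print("total payoff, all defect:   ", total_payoff(0))
    print("total payoff, all cooperate:", total_payoff(N_PLAYERS))
```

Each player gains 4 by switching from C to D whatever the others do, yet everyone defecting yields a total of 40 against 100 when everyone cooperates.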


11. Repeated Games 


Consider a two-person game G, where the strategy set for Player 1 is {a,b,c} and the
strategy set for Player 2 is {x,y,z} with the following payoffs:





a. What strategies, if any, are Nash equilibria to G when played as a one-shot game?


b. Suppose now that G is played as a repeated game with a discount factor δ < 1 for
both players and with payoffs in the repeated game equal to the discounted sum of
payoffs in each sub-game. Give one Nash equilibrium to this repeated game.


c. Recalling the repeated game Theorem (see below), what conditions must hold
in order to insure that the payoff (10,7) obtains (at every stage) as an equilibrium to a
trigger strategy?


Theorem (Friedman 1971):
Let G be a finite static game of complete information. Let e = (e_1, ..., e_n) denote the
payoffs from a Nash equilibrium of G and let x = (x_1, ..., x_n) denote any other feasible
payoffs from G. If x_i > e_i for every player i and if δ is sufficiently close to one, then there
exists a subgame-perfect Nash equilibrium of the infinitely repeated game G(∞,δ) that
achieves x as the average payoff. The requirement on δ is that, for every i,


δ ≥ (d_i - x_i)/(d_i - e_i),
where d_i is the payoff from deviating from x_i to the best response strategy for i when all
other players play x.


d. What would happen to the restriction on the discount factor δ in (c) above if the
payoff from the collective strategy (c,z) were (15,0) instead of (20,0) and why does
this make sense?


12. Review Case Study on Patent Race


I will be asking a question about the Patent Race Game so make sure you understand
it, especially how to determine best responses for each sub-game and how to put the
sub-games into the overall game.


13. Genetic Algorithms




In the GA approach to discovering good rules for the Repeated Prisoner's Dilemma (PD)
Game, you select a history of 2 and let the GA work for 25 generations. You expect 
something like Tit-for-Tat to emerge in this game. Give an example of the Tit-for-Tat rule 
you expect to find in the final population (the rule should look something like the following: 
CCCC CCCD DDCD DDCD 

but it won't be exactly this). 
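

To see why something like Tit-for-Tat does well, it helps to play it against itself and against an unconditional defector. The sketch below is an illustration only: it represents strategies as ordinary functions rather than as the 16-character rule strings (whose encoding is not spelled out here), and it assumes the payoffs 3, 0, 5 and 1 for the PD matrix.

```python
# Payoffs (row player, column player) for each pair of moves; assumed values.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(my_history, their_history):
    """Cooperate first, then repeat the opponent's previous move."""
    return "C" if not their_history else their_history[-1]


def always_defect(my_history, their_history):
    return "D"


def play(strategy_a, strategy_b, rounds: int = 100):
    """Average per-round payoffs for two strategies over a repeated PD."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a / rounds, score_b / rounds


if __name__ == "__main__":
    print("TFT vs TFT:          ", play(tit_for_tat, tit_for_tat))
    print("TFT vs always defect:", play(tit_for_tat, always_defect))
```

Two Tit-for-Tat players average 3.0 per round, while against a defector Tit-for-Tat loses only the first round; average payoffs just below 3 in a GA run are consistent with a population that mostly reciprocates cooperation.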